Preface


The chapters of this book deal with the basic formulation of the waveguide and cavity
resonator equations, especially when the cross sections of the guides and resonators have
arbitrary shapes. The focus is on expressing the total field energy within such a cavity
resonator as a quadratic form in the complex coefficients that determine the modal
expansions of the electromagnetic field. Such an expression can then be immediately
quantized by replacing the coefficients with creation and annihilation operators.


The book also reviews basic statistical signal processing, covering linear models, fast
algorithms for estimating the parameters in such linear models, and applications of group
representation theory to image processing problems, especially the representations of
the permutation groups and induced representation theory applied to image processing
problems involving the three dimensional Euclidean motion group. Some attention
has been devoted to quantum aspects of stochastic filtering theory. The unscented Kalman
filter (UKF) as an improvement of the extended Kalman filter (EKF) in nonlinear filtering
theory has been explained.


The Hartree-Fock equations for approximately solving the two electron atomic
problem, taking spin-orbit magnetic field interactions into account, have been discussed.
In the limit as the lattice tends to a continuum, the convergence of the stochastic
differential equations governing interacting particles on the lattice to a hydrodynamic
scaling limit has also been discussed. Statistical performance analysis of the MUSIC
and ESPRIT algorithms, used for estimating the directions of arrival of multiple plane
wave emitting signal sources using an array of sensors, has been outlined here. It is
based on our understanding of how the singular value decomposition of a matrix gets
perturbed when the given matrix is subject to a small random perturbation. Finally,
some aspects of supersymmetry and supergravity have been discussed in the light of
the fact that supersymmetry is now a mathematically well-defined field of research
that has opened up a new avenue to our understanding of how gravity can be unified
with the other fundamental forces of nature. This book is based on the lectures
delivered by the author to undergraduate and postgraduate students. These courses were
on transmission lines and waveguides and on statistical signal processing.
Chapter 1


Remarks and Problems on 
Transmission Lines and 
Waveguides 


[1] Study the historical development of the Maxwell equations for
electromagnetism, starting with the experimental findings and theoretical
formulations of Coulomb, Ampere, Oersted, Faraday and Gauss, and culminating in
Maxwell’s introduction of the displacement current to satisfy charge
conservation in time varying situations. Study how Maxwell converted all these
findings into laws expressible in the form of partial differential equations based
on the basic operations of vector calculus and how, by manipulating these
equations, he proved that electric and magnetic fields propagate in vacuum as plane
waves travelling at the speed of light, thereby unifying light with
electricity and magnetism. Study how Heinrich Hertz confirmed Maxwell’s
theory about two decades later using Leyden jar experiments.


Max Planck struggled for over twenty years to finally arrive at his law for
the spectrum of black body radiation. The earlier law for this spectrum that
was being used was Wien’s displacement law, according to which the spectral
density of black body radiation was proportional to

$$S(\nu) = C\nu^3\exp(-\beta\nu)$$

with $\beta = A/T$, where $A$ is a constant. At very low frequencies this law states
that the spectral density is proportional to $\nu^3$. The same is true at very high
temperatures. At very low temperatures, this law predicts that the spectrum
will vanish, ie, there will not be any radiation at all. The high temperature
and low frequency limit of Wien’s displacement law was in sharp contradiction
with experiment. Planck used a little of statistical mechanics but more of curve
fitting to modify Wien’s displacement law to

$$S(\nu) = \frac{C\nu^3}{\exp(A\nu/T) - 1}$$


This law has the same low temperature and high frequency behaviour as Wien’s
displacement law, but at high temperatures or at low frequencies, while Wien’s
law predicts a $C\nu^3$ dependence of the spectrum of black body radiation, Planck’s
law gives a behaviour proportional to $T\nu^2$ (namely $(C/A)T\nu^2$), which is in
agreement with the experiments conducted by Rubens and Kurlbaum. Planck, by fitting
this curve to the experimental curve of Rubens and Kurlbaum, arrived at the formula
$A = h/k$ where $k$ is Boltzmann’s constant and $h$ is called Planck’s constant.
Planck later on gave the following derivation of his radiation law: He assumed that
radiation energy comes in quanta of $h\nu$, ie, in integer multiples of $h\nu$, via
harmonic oscillators. When a harmonic oscillator of frequency $\nu$ is excited to the
$n^{th}$ energy level, it acquires an energy of $nh\nu$ and by Boltzmann’s relation
between energy and probability, the probability of such an oscillator getting excited
to the $n^{th}$ level is proportional to $\exp(-nh\nu/kT)$. Hence, Planck concluded
that the average energy of an oscillator of frequency $\nu$ is given by

$$\frac{\sum_{n\geq 0} nh\nu\,\exp(-nh\nu/kT)}{\sum_{n\geq 0}\exp(-nh\nu/kT)} = \frac{h\nu}{\exp(h\nu/kT) - 1}$$
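The closed form for the average oscillator energy can be checked numerically. The following is a minimal sketch, assuming units in which $h\nu$ and $kT$ are plain numbers; the function names and the truncation level `n_max` are illustrative choices, not part of the original text.

```python
import math

def average_oscillator_energy(h_nu, kT, n_max=2000):
    """Boltzmann-weighted mean of the energy levels n*h_nu, n = 0..n_max."""
    weights = [math.exp(-n * h_nu / kT) for n in range(n_max + 1)]
    numerator = sum(n * h_nu * w for n, w in enumerate(weights))
    return numerator / sum(weights)

def planck_closed_form(h_nu, kT):
    """Closed form h_nu / (exp(h_nu/kT) - 1) of the same average."""
    return h_nu / math.expm1(h_nu / kT)

h_nu, kT = 1.0, 0.7  # assumed sample values in arbitrary energy units
direct = average_oscillator_energy(h_nu, kT)
closed = planck_closed_form(h_nu, kT)
print(direct, closed)

# In the limit h_nu << kT the average energy approaches kT.
low_frequency = planck_closed_form(1e-3, 1.0)
print(low_frequency)
```

The truncated sum and the closed form agree to rounding error, and the last line illustrates the high temperature limit that produces the $T\nu^2$ behaviour discussed above.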


Next, Planck used the familiar method of Rayleigh to derive a formula for the total
number of oscillators having frequency in the range $[\nu, \nu + d\nu]$ and belonging to
the spatial volume $V$. This number is given by

$$U(\nu) = \int_{q \in V,\ pc \in [h\nu,\, h(\nu + d\nu)]} d^3q\,d^3p/h^3$$

where he used Einstein’s energy-momentum relation $E = pc$ for photons, which
have zero mass. Taking into account that the photon has two independent
modes of polarization, ie, perpendicular to its direction of propagation, this
number evaluates to

$$d\nu \cdot V \cdot \frac{d}{d\nu}\int_{p < h\nu/c} 8\pi p^2\,dp/h^3 = d\nu \cdot V \cdot \frac{d}{d\nu}\left(\frac{8\pi\nu^3}{3c^3}\right) = d\nu \cdot V \cdot 8\pi\nu^2/c^3$$

Multiplying this number with the average energy of an oscillator gives us the
average energy of black body radiation in the frequency range $[\nu, \nu + d\nu]$:

$$S(\nu)\,d\nu = \frac{8\pi h\nu^3/c^3}{\exp(h\nu/kT) - 1}\,V\,d\nu$$


This is the famous Planck’s law of black body radiation and its advent was the
starting point of the whole of modern quantum mechanics and quantum field
theory. First came Newton, who unified gravitation, which causes the apple to
fall, with Kepler’s laws of planetary motion by proposing his celebrated inverse
square law. He invented calculus, along with Leibniz, in order to establish Kepler’s
laws of motion from his inverse square law of gravitation. More precisely, Robert
Hooke, who was the curator of the Royal Society, posed to Newton the inverse
problem: What radial force law of attraction should exist between the sun and
a planet in order that the planet move around the sun in ellipses satisfying
Kepler’s laws? Newton solved this inverse problem by inventing calculus and
formulating his second law of motion in terms of differential calculus, which
he called fluxions. He applied this law to the sun planet system and proved
that when the force of attraction between the two is the inverse square law
then the planet is guaranteed to revolve around the sun in an ellipse. It is a
little unfortunate that Robert Hooke’s name does not appear prominently in
Newton’s magnum opus “Philosophiae Naturalis Principia Mathematica”, whose
Latin title translates to “Mathematical Principles of Natural Philosophy”.
Today we believe that some portion of the credit for the discovery of the inverse
square law of gravitation should go to Robert Hooke.

After Newton, the next major unification in physics came with Maxwell
when he created the four laws of electromagnetism based on the findings of
Coulomb, Gauss, Ampere, Oersted and Faraday and, using these laws, predicted
that electricity, magnetism and light are one and the same phenomenon, which
appear distinct to us primarily because of the frequencies at which
they propagate.


[2] The rectangular waveguide: Expressing the transverse components of the
electromagnetic field in terms of the longitudinal components.

[a] A rectangular waveguide has dimensions $a, b$ along the $x$ and $y$ axes
respectively. Assume that the length of the guide is $d$. When the fields have the
sinusoidal time dependence $\exp(j\omega t)$ and dependence upon $z$ as $\exp(-\gamma z)$, then
$\partial/\partial t$ and $\partial/\partial z$ get replaced respectively by multiplication with $j\omega$ and $-\gamma$.

Hence, the Maxwell curl equations in the $\omega - x - y - z$ domain are

$$\mathrm{curl}\,E = -j\omega\mu H, \quad \mathrm{curl}\,H = j\omega\epsilon E$$

which in component form become

$$E_{z,y} + \gamma E_y = -j\omega\mu H_x \quad (1)$$
$$-\gamma E_x - E_{z,x} = -j\omega\mu H_y \quad (2)$$
$$E_{y,x} - E_{x,y} = -j\omega\mu H_z \quad (3)$$

and likewise by duality with $E \to H$, $H \to -E$, $\epsilon \to \mu$, $\mu \to \epsilon$. Write down the
dual equations (4), (5), (6).
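For reference, applying the duality substitution to (1), (2), (3) term by term gives the following sketch of the dual equations. The numbering (4), (5), (6) is an assumption inferred from the later references to these equations.

```latex
H_{z,y} + \gamma H_y = j\omega\epsilon E_x \quad (4)
-\gamma H_x - H_{z,x} = j\omega\epsilon E_y \quad (5)
H_{y,x} - H_{x,y} = j\omega\epsilon E_z \quad (6)
```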
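Solve (1), (2), (4), (5) for $\{E_x, E_y, H_x, H_y\}$ in terms of $\{E_{z,x}, E_{z,y}, H_{z,x}, H_{z,y}\}$ and
show that this solution can be expressed as

$$E_\perp = (-\gamma/h^2)\nabla_\perp E_z - (j\omega\mu/h^2)\nabla_\perp H_z \times \hat{z} \quad (7)$$
$$H_\perp = (-\gamma/h^2)\nabla_\perp H_z + (j\omega\epsilon/h^2)\nabla_\perp E_z \times \hat{z} \quad (8)$$

where
$$\nabla_\perp = \hat{x}\,\partial/\partial x + \hat{y}\,\partial/\partial y,$$
$$E_\perp = E_x\hat{x} + E_y\hat{y}, \quad H_\perp = H_x\hat{x} + H_y\hat{y},$$
$$h^2 = \gamma^2 + \omega^2\mu\epsilon$$

[3]

[a] Show that all the procedures and expressions in Step 1 are valid even when
$\epsilon, \mu$ are functions of $(\omega, x, y)$ but not of $z$. Assuming $\epsilon, \mu$ to be constants, show
that substituting (7) and (8) into (3) and (6) gives us the two dimensional
Helmholtz equation for $E_z, H_z$:

$$(\nabla_\perp^2 + h^2)E_z = 0, \quad (\nabla_\perp^2 + h^2)H_z = 0 \quad (9)$$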


[4]

[a] Show using (7) and (8) that the boundary conditions that the tangential
components of $E$ and the normal components of $H$ vanish on the boundary walls
of the guide are equivalent to the conditions

$$E_z = 0,\ x = 0, a,\ y = 0, b; \quad \partial H_z/\partial x = 0,\ x = 0, a; \quad \partial H_z/\partial y = 0,\ y = 0, b$$

Hence by applying the separation of variables method to (9) deduce that the
general solutions are given in the frequency domain by

$$E_z(\omega, x, y, z) = \sum_{m,n\geq 1} c(m,n,\omega)\,u_{mn}(x,y)\exp(-\gamma_{mn}(\omega)z),$$

$$H_z(\omega, x, y, z) = \sum_{m,n\geq 1} d(m,n,\omega)\,v_{mn}(x,y)\exp(-\gamma_{mn}(\omega)z),$$

where
$$u_{mn}(x,y) = (2/\sqrt{ab})\sin(m\pi x/a)\sin(n\pi y/b),$$
$$v_{mn}(x,y) = (2/\sqrt{ab})\cos(m\pi x/a)\cos(n\pi y/b),$$
$$\gamma_{mn}(\omega) = \sqrt{h_{mn}^2 - \omega^2\mu\epsilon},$$
where

$$h_{mn} = ((m\pi/a)^2 + (n\pi/b)^2)^{1/2}$$
where $m, n$ are positive integers. Show that

$$\int_0^a\int_0^b u_{mn}(x,y)\,u_{pq}(x,y)\,dx\,dy = \delta_{mp}\delta_{nq}$$

and deduce that if $\mathrm{Re}(\gamma_{mn}) = \alpha_{mn}$, then

$$\int_0^a\int_0^b\int_0^d |E_z|^2\,dx\,dy\,dz = \sum_{m,n}|c(m,n,\omega)|^2\,\frac{1 - \exp(-2\alpha_{mn}(\omega)d)}{2\alpha_{mn}(\omega)}$$

Likewise evaluate

$$\int |H_z|^2\,dx\,dy\,dz, \quad \int |E_\perp|^2\,dx\,dy\,dz, \quad \int |H_\perp|^2\,dx\,dy\,dz$$

and hence evaluate the time averaged energy in the electromagnetic field
at frequency $\omega$:

$$U = (1/4)\int_0^a\int_0^b\int_0^d (\epsilon|E(\omega,x,y,z)|^2 + \mu|H(\omega,x,y,z)|^2)\,dx\,dy\,dz$$

Question: Why does the 1/4 factor come rather than the 1/2 factor?
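The orthonormality of the transverse mode functions can be checked numerically. The sketch below assumes the normalization $2/\sqrt{ab}$ for $u_{mn}$ and uses a simple midpoint rule; the guide dimensions and the helper names `u` and `inner_product` are illustrative assumptions.

```python
import math

def u(m, n, x, y, a, b):
    """Normalized Dirichlet mode u_mn of the a x b rectangle."""
    return (2.0 / math.sqrt(a * b)) * math.sin(m * math.pi * x / a) * math.sin(n * math.pi * y / b)

def inner_product(m, n, p, q, a, b, steps=200):
    """Midpoint-rule approximation of the integral of u_mn * u_pq over the rectangle."""
    dx, dy = a / steps, b / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        for k in range(steps):
            y = (k + 0.5) * dy
            total += u(m, n, x, y, a, b) * u(p, q, x, y, a, b)
    return total * dx * dy

a, b = 0.02286, 0.01016  # assumed guide dimensions in metres
same = inner_product(1, 2, 1, 2, a, b)       # expected to be close to 1
different = inner_product(1, 2, 3, 2, a, b)  # expected to be close to 0
print(same, different)
```

Because of this orthonormality, the cross terms between different modes drop out of the integral of $|E_z|^2$, which is why only the sum of $|c(m,n,\omega)|^2$ terms survives.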


[5]

[a] Calculate the power dissipated in the waveguide walls assuming that the
region outside has a finite conductivity $\sigma$.

Solution: The surface current density on the wall $x = 0$ is

$$J_s(0,y,z) = \hat{x} \times H(0,y,z) = H_y(0,y,z)\hat{z} - H_z(0,y,z)\hat{y}$$

This is the current per unit length on the wall. It can be attributed to a current
density in the infinite region beyond this wall having a value
of $J(x,y,z)$, $x < 0$, provided that we take

$$\int_{-\infty}^0 J(x,y,z)\,dx = J_s(0,y,z)$$

However from basic electromagnetic wave propagation theory in conducting
media, we know that

$$J(x,y,z) = J(0,y,z)\exp(\gamma_0 x), \quad \gamma_0 = \sqrt{j\omega\mu(\sigma + j\omega\epsilon)}$$

Thus,
$$J(0,y,z)/\gamma_0 = J_s(0,y,z)$$

and hence the average power dissipated inside the region $x < 0$, $0 < y < b$,
$0 < z < d$ is given using Ohm’s law by

$$P = \int \frac{|J(x,y,z)|^2}{2\sigma}\,dx\,dy\,dz$$

$$= \left(\int_0^b\int_0^d \frac{|J_s(0,y,z)|^2|\gamma_0|^2}{2\sigma}\,dy\,dz\right)\left(\int_{-\infty}^0 \exp(2\alpha_0 x)\,dx\right)$$

$$= \int_0^b\int_0^d \frac{|\gamma_0|^2}{4\sigma\alpha_0}|J_s(0,y,z)|^2\,dy\,dz$$

where
$$\alpha_0 = \mathrm{Re}(\gamma_0)$$

Repeat this calculation for the other walls.
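As a numerical illustration of the prefactor $|\gamma_0|^2/4\sigma\alpha_0$ in the wall loss formula, the sketch below evaluates it for an assumed copper wall at 10 GHz and compares it with half the familiar surface resistance $1/\sigma\delta$, where $\delta$ is the skin depth. The material constants and the function name `wall_loss_factor` are assumptions made for illustration.

```python
import cmath
import math

MU0 = 4e-7 * math.pi      # permeability of free space, H/m
EPS0 = 8.8541878128e-12   # permittivity of free space, F/m

def wall_loss_factor(freq_hz, sigma, mu=MU0, eps=EPS0):
    """Return |gamma_0|^2 / (4 sigma alpha_0) for the conducting exterior."""
    omega = 2 * math.pi * freq_hz
    gamma0 = cmath.sqrt(1j * omega * mu * (sigma + 1j * omega * eps))
    alpha0 = gamma0.real  # principal square root has positive real part here
    return abs(gamma0) ** 2 / (4 * sigma * alpha0)

sigma_cu = 5.8e7  # assumed conductivity of copper, S/m
f = 10e9          # assumed operating frequency, Hz
factor = wall_loss_factor(f, sigma_cu)
skin_depth = math.sqrt(2 / (2 * math.pi * f * MU0 * sigma_cu))
surface_resistance = 1 / (sigma_cu * skin_depth)
print(factor, surface_resistance / 2)
```

For a good conductor ($\sigma \gg \omega\epsilon$) the two printed numbers agree closely, so the dissipated power reduces to the integral of $(R_s/2)|J_s|^2$ over the wall with $R_s = 1/\sigma\delta$.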

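[6]

[a] Show that if the waveguide has an arbitrary cross section, then in an arbitrary
orthogonal coordinate system $(q_1, q_2)$ for the $x-y$ plane, equations (7) and (8)
can be expressed as

$$E_1 = \frac{-\gamma}{h^2G_1}\frac{\partial E_z}{\partial q_1} - \frac{j\omega\mu}{h^2G_2}\frac{\partial H_z}{\partial q_2}$$
$$E_2 = \frac{-\gamma}{h^2G_2}\frac{\partial E_z}{\partial q_2} + \frac{j\omega\mu}{h^2G_1}\frac{\partial H_z}{\partial q_1}$$
$$H_1 = \frac{-\gamma}{h^2G_1}\frac{\partial H_z}{\partial q_1} + \frac{j\omega\epsilon}{h^2G_2}\frac{\partial E_z}{\partial q_2}$$
$$H_2 = \frac{-\gamma}{h^2G_2}\frac{\partial H_z}{\partial q_2} - \frac{j\omega\epsilon}{h^2G_1}\frac{\partial E_z}{\partial q_1}$$

where $G_1, G_2$ are the Lame coefficients for the orthogonal curvilinear coordinate
system $(q_1, q_2)$, ie,

$$G_1 = \sqrt{(\partial x/\partial q_1)^2 + (\partial y/\partial q_1)^2},$$
$$G_2 = \sqrt{(\partial x/\partial q_2)^2 + (\partial y/\partial q_2)^2},$$

and
$$E_\perp(\omega, q_1, q_2, z) = E_1(\omega, q_1, q_2, z)\hat{q}_1 + E_2(\omega, q_1, q_2, z)\hat{q}_2,$$

$$H_\perp(\omega, q_1, q_2, z) = H_1(\omega, q_1, q_2, z)\hat{q}_1 + H_2(\omega, q_1, q_2, z)\hat{q}_2,$$

define the curvilinear components of $E_\perp$ and $H_\perp$ respectively. Show that
combining these equations with the $z$ component of the Maxwell curl equations
results in the two dimensional Helmholtz equation in the curvilinear system:

$$\frac{1}{G_1G_2}\left(\frac{\partial}{\partial q_1}\frac{G_2}{G_1}\frac{\partial E_z}{\partial q_1} + \frac{\partial}{\partial q_2}\frac{G_1}{G_2}\frac{\partial E_z}{\partial q_2}\right) + h^2E_z = 0$$

and the same equation for $H_z$.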


[b] Show that the boundary conditions on the conducting walls in the
curvilinear case, assuming that the boundary curve of the waveguide cross section is
given by $q_1 = c$ = constant, assume the forms

$$E_z = 0,\ q_1 = c; \quad \frac{\partial H_z}{\partial q_1} = 0,\ q_1 = c$$

Deduce that in general, the modal eigenvalues $h^2$ are different in the TE ($E_z = 0$)
and the TM ($H_z = 0$) cases. They are the same only in the rectangular
waveguide case.


[c] Deduce an expression for the total time averaged power at frequency $\omega$
dissipated in the waveguide’s conducting walls assuming that the region exterior
to the guide has a constant conductivity of $\sigma$.

hint: The surface current density is

$$J_s(\omega, q_2, z) = \hat{q}_1 \times H(\omega, c, q_2, z)$$

and hence by the same reasoning as in step 3, we have that the volume current
density in the conducting exterior satisfies

$$\nabla^2J(\omega, q_1, q_2, z) - \gamma_0(\omega)^2J(\omega, q_1, q_2, z) = 0,$$

$$\gamma_0(\omega) = \sqrt{j\omega\mu(\sigma + j\omega\epsilon)},$$
$$\int_c^\infty J(\omega, q_1, q_2, z)G_1(q_1, q_2)\,dq_1 = J_s(\omega, q_2, z)$$

An approximate solution to this, corresponding to the situation when the field
propagates only along the $q_1$ direction in the conducting region, is given by

$$J(\omega, q_1, q_2, z) = \gamma_0(\omega)J_s(\omega, q_2, z)\exp\left(-\gamma_0(\omega)\int_c^{q_1} G_1(q_1', q_2)\,dq_1'\right)$$

Note that this situation corresponds to the fact that the fields propagate from
the surface of the guide into the depth of the conducting walls normally, ie, along
the direction $\hat{q}_1$ into it, and the fact that in propagating from $q_1 = c$ to $q_1$
normally, the distance covered is $l = \int_c^{q_1} G_1(q_1', q_2)\,dq_1'$. Then the average power
dissipated in a length $d$ of the guide at frequency $\omega$ is given by

$$P_{diss} = (1/2\sigma)\int_c^\infty\int_0^A\int_0^d |J(\omega, q_1, q_2, z)|^2G_1(q_1, q_2)G_2(q_1, q_2)\,dq_1\,dq_2\,dz$$

where when $q_2$ varies over $[0, A]$ one full curve on the cross section is covered.
Note that $\hat{q}_2$ is tangential to the waveguide boundary curve for any cross section.


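Specialization to cylindrical guides:

[a] In Step 4, choose $q_1 = \rho = \sqrt{x^2 + y^2}$, $q_2 = \phi = \tan^{-1}(y/x)$. Show that
$(\rho, \phi)$ form an orthogonal curvilinear system of coordinates in the $xy$ plane and
that

$$G_1 = G_\rho = 1, \quad G_2 = G_\phi = \rho$$

so that equations (7) and (8) assume the forms

$$E_\rho = \frac{-\gamma}{h^2}\frac{\partial E_z}{\partial\rho} - \frac{j\omega\mu}{h^2\rho}\frac{\partial H_z}{\partial\phi}$$
$$E_\phi = \frac{-\gamma}{h^2\rho}\frac{\partial E_z}{\partial\phi} + \frac{j\omega\mu}{h^2}\frac{\partial H_z}{\partial\rho}$$
$$H_\rho = \frac{-\gamma}{h^2}\frac{\partial H_z}{\partial\rho} + \frac{j\omega\epsilon}{h^2\rho}\frac{\partial E_z}{\partial\phi}$$
$$H_\phi = \frac{-\gamma}{h^2\rho}\frac{\partial H_z}{\partial\phi} - \frac{j\omega\epsilon}{h^2}\frac{\partial E_z}{\partial\rho}$$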


[b] Substituting the above into the $z$ component of the Maxwell curl
equations then gives us the two dimensional Helmholtz equation for $E_z, H_z$ in the
plane polar coordinate system:

$$\frac{1}{\rho}\frac{\partial}{\partial\rho}\rho\frac{\partial E_z}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2E_z}{\partial\phi^2} + h^2E_z = 0$$

and the same equation for $H_z$. The boundary conditions are given by

$$E_z = 0,\ \rho = R; \quad \frac{\partial H_z}{\partial\rho} = 0,\ \rho = R$$

and solving these by the method of separation of variables with the
application of the appropriate boundary conditions then gives us the general solutions

$$E_z(\omega, \rho, \phi, z) = \sum_{m,n}[J_m(\alpha_m(n)\rho/R)(c_1(\omega,m,n)\cos(m\phi) + c_2(\omega,m,n)\sin(m\phi))\exp(-\gamma^E_{mn}(\omega)z)]$$

$$H_z(\omega, \rho, \phi, z) = \sum_{m,n}[J_m(\beta_m(n)\rho/R)(d_1(\omega,m,n)\cos(m\phi) + d_2(\omega,m,n)\sin(m\phi))\exp(-\gamma^H_{mn}(\omega)z)]$$

where $\alpha_m(n), n = 1, 2, \ldots$ are the roots of $J_m(x) = 0$ while $\beta_m(n), n = 1, 2, \ldots$
are the roots of $J_m'(x) = 0$ and further

$$h = h^E_{mn} = \alpha_m(n)/R$$
in the TM case ($E_z \neq 0, H_z = 0$) while
$$h = h^H_{mn} = \beta_m(n)/R$$

in the TE case ($E_z = 0, H_z \neq 0$). Further

$$\gamma^E_{mn}(\omega) = \sqrt{\alpha_m(n)^2/R^2 - \omega^2\mu\epsilon},$$

$$\gamma^H_{mn}(\omega) = \sqrt{\beta_m(n)^2/R^2 - \omega^2\mu\epsilon}$$

are the propagation constants for $TM_{mn}$ and $TE_{mn}$ modes respectively.
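The roots $\alpha_m(n)$ and $\beta_m(n)$ have to be found numerically. The following is a minimal sketch using only the standard library: it evaluates $J_m$ by its power series, locates the first root of $J_0$ and of $J_1'$ by bisection, and converts them to cutoff frequencies via $\gamma = 0$, ie, $\omega^2\mu\epsilon = h^2$. The guide radius, the vacuum filling, the bracketing intervals and all function names are assumptions made for illustration.

```python
import math

def bessel_j(m, x, terms=60):
    """Power-series evaluation of J_m(x) for integer m (adequate for moderate |x|)."""
    if m < 0:
        return (-1) ** (-m) * bessel_j(-m, x, terms)
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (x / 2.0) ** (2 * k + m) / (math.factorial(k) * math.factorial(k + m))
    return total

def bessel_j_prime(m, x):
    """Derivative via the recurrence J_m' = (J_{m-1} - J_{m+1}) / 2."""
    return 0.5 * (bessel_j(m - 1, x) - bessel_j(m + 1, x))

def bisect(f, lo, hi, tol=1e-13):
    """Root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

alpha_01 = bisect(lambda x: bessel_j(0, x), 2.0, 3.0)        # first root of J_0
beta_11 = bisect(lambda x: bessel_j_prime(1, x), 1.0, 2.5)   # first root of J_1'

R = 0.01          # assumed guide radius in metres
c = 299792458.0   # vacuum filling assumed, so 1/sqrt(mu*eps) = c
f_cut_tm01 = alpha_01 * c / (2 * math.pi * R)
f_cut_te11 = beta_11 * c / (2 * math.pi * R)
print(alpha_01, beta_11, f_cut_tm01, f_cut_te11)
```

In this sketch the mode built on $\beta_1(1)$ has the lower cutoff, which illustrates that the TE and TM eigenvalues $h$ generally differ in the cylindrical guide.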


Exercises: 
[1] Show using the separation of variables applied to the two dimensional
Helmholtz equation that $J_m(x)$ actually satisfies the Bessel equation

$$x^2J_m''(x) + xJ_m'(x) + (x^2 - m^2)J_m(x) = 0$$

[2] Show that if

$$f_{mn}(\rho) = J_m(\alpha_m(n)\rho/R), \quad J_m(\alpha_m(n)) = 0$$

then
$$\rho^2f_{mn}''(\rho) + \rho f_{mn}'(\rho) + (\alpha_m(n)^2\rho^2/R^2 - m^2)f_{mn}(\rho) = 0$$

and hence prove the orthogonality relations
$$\int_0^R J_m(\alpha_m(n)\rho/R)\,J_m(\alpha_m(k)\rho/R)\,\rho\,d\rho = 0, \quad n \neq k$$

Hint: Multiply the above differential equation for $f_{mn}(\rho)$ by $f_{mk}(\rho)/\rho$,
interchange $n$ and $k$, subtract the second equation from the first and integrate from
$\rho = 0$ to $\rho = R$. Use integration by parts to deduce the identity

$$(\alpha_m(n)^2 - \alpha_m(k)^2)\int_0^R \rho f_{mn}(\rho)f_{mk}(\rho)\,d\rho = 0$$

[3] Repeat Exercise [2] with $\alpha_m(n)$ replaced by $\beta_m(n)$ where now $J_m'(\beta_m(n)) = 0$.
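The orthogonality relation of Exercise [2] can be checked numerically before proving it. The sketch below does this for $m = 0$ with the first two roots of $J_0$, using a power series for the Bessel function and Simpson's rule for the integral; it also compares the norm with the standard value $(R^2/2)J_1(\alpha)^2$. The helper names, the bracketing intervals and the choice $R = 1$ are assumptions.

```python
import math

def bessel_j(m, x, terms=40):
    """Power-series evaluation of J_m(x) for integer m >= 0 and moderate |x|."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (x / 2.0) ** (2 * k + m) / (math.factorial(k) * math.factorial(k + m))
    return total

def bisect(f, lo, hi, tol=1e-13):
    """Root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

R = 1.0
alpha_1 = bisect(lambda x: bessel_j(0, x), 2.0, 3.0)  # first root of J_0
alpha_2 = bisect(lambda x: bessel_j(0, x), 5.0, 6.0)  # second root of J_0

cross = simpson(lambda r: bessel_j(0, alpha_1 * r / R) * bessel_j(0, alpha_2 * r / R) * r, 0.0, R)
norm = simpson(lambda r: bessel_j(0, alpha_1 * r / R) ** 2 * r, 0.0, R)
expected_norm = 0.5 * R ** 2 * bessel_j(1, alpha_1) ** 2
print(cross, norm, expected_norm)
```

The cross integral vanishes to numerical precision while the norm does not, which is the content of the identity in the hint.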


[4] Prove the orthogonality relations

$$\int_0^{2\pi}\cos(m\phi)\cos(n\phi)\,d\phi = \int_0^{2\pi}\sin(m\phi)\sin(n\phi)\,d\phi = 0, \quad m \neq n$$

and

$$\int_0^{2\pi}\cos(m\phi)\sin(n\phi)\,d\phi = 0, \quad \forall m, n$$

[5] Using Exercises [2], [3], [4], deduce that the functions
$$J_m(\alpha_m(n)\rho/R)\cos(m\phi), \quad J_m(\alpha_m(n)\rho/R)\sin(m\phi), \quad m, n = 1, 2, \ldots$$

are all mutually orthogonal on the disc of radius $R$ w.r.t. the area measure
$\rho\,d\rho\,d\phi$ and likewise for the functions

$$J_m(\beta_m(n)\rho/R)\cos(m\phi), \quad J_m(\beta_m(n)\rho/R)\sin(m\phi), \quad m, n = 1, 2, \ldots$$
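[6] Use the result of Exercise [5] to show that when

$$E_z = \sum_{m,n}[J_m(\alpha_m(n)\rho/R)(c_1(\omega,m,n)\cos(m\phi) + c_2(\omega,m,n)\sin(m\phi))\exp(-\gamma^E_{mn}(\omega)z)]$$

then
$$\int_0^R\int_0^{2\pi}|E_z|^2\rho\,d\rho\,d\phi = \sum_{m,n}\lambda(m,n)(|c_1(\omega,m,n)|^2 + |c_2(\omega,m,n)|^2)\exp(-2\alpha^E_{mn}(\omega)z)$$

where
$$\alpha^E_{mn}(\omega) = \mathrm{Re}(\gamma^E_{mn}(\omega))$$
and

$$\lambda(m,n) = \int_0^R\int_0^{2\pi}J_m(\alpha_m(n)\rho/R)^2\cos^2(m\phi)\,\rho\,d\rho\,d\phi$$
$$= \int_0^R\int_0^{2\pi}J_m(\alpha_m(n)\rho/R)^2\sin^2(m\phi)\,\rho\,d\rho\,d\phi$$
$$= \pi\int_0^R J_m(\alpha_m(n)\rho/R)^2\rho\,d\rho$$

Likewise, show that for

$$H_z = \sum_{m,n}[J_m(\beta_m(n)\rho/R)(d_1(\omega,m,n)\cos(m\phi) + d_2(\omega,m,n)\sin(m\phi))\exp(-\gamma^H_{mn}(\omega)z)]$$

we have

$$\int_0^R\int_0^{2\pi}|H_z|^2\rho\,d\rho\,d\phi = \sum_{m,n}\mu(m,n)(|d_1(\omega,m,n)|^2 + |d_2(\omega,m,n)|^2)\exp(-2\alpha^H_{mn}(\omega)z)$$

where
$$\mu(m,n) = \pi\int_0^R J_m(\beta_m(n)\rho/R)^2\rho\,d\rho$$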


[7] Prove that

$$\int_0^R J_m'(\alpha_m(n)\rho/R)\,J_m'(\alpha_m(k)\rho/R)\,\rho\,d\rho = 0, \quad n \neq k$$

hint: Integrate by parts in two different ways and substitute for the second
derivatives using Bessel’s equation.

[8] Repeat [7] with $\alpha_m(n)$ replaced by $\beta_m(n)$.
[9] Use the results of the previous Exercises and the expressions for $E_\perp$
and $H_\perp$ in terms of $E_z, H_z$ to express the time averaged energy in the electric
field as

$$U_E = (\epsilon/4)\int_{[0,R]\times[0,2\pi)\times[0,d]} |E|^2\rho\,d\rho\,d\phi\,dz$$

$$= (\epsilon/4)\int (|E_z|^2 + |E_\perp|^2)\rho\,d\rho\,d\phi\,dz$$

$$= \sum_{m,n}[p_E(\omega,m,n)(|c_1(\omega,m,n)|^2 + |c_2(\omega,m,n)|^2) + q_E(\omega,m,n)(|d_1(\omega,m,n)|^2 + |d_2(\omega,m,n)|^2)]$$

and that in the magnetic field as

$$U_H = (\mu/4)\int_{[0,R]\times[0,2\pi)\times[0,d]} |H|^2\rho\,d\rho\,d\phi\,dz$$

$$= (\mu/4)\int (|H_z|^2 + |H_\perp|^2)\rho\,d\rho\,d\phi\,dz$$

$$= \sum_{m,n}[p_H(\omega,m,n)(|d_1(\omega,m,n)|^2 + |d_2(\omega,m,n)|^2) + q_H(\omega,m,n)(|c_1(\omega,m,n)|^2 + |c_2(\omega,m,n)|^2)]$$

where $p_E(\omega,m,n), q_E(\omega,m,n), p_H(\omega,m,n), q_H(\omega,m,n)$ depend only on $\epsilon, \mu, R, d$
as parameters.


Quality factor

[a] The quality factor of a guide is defined as the ratio of the average energy
stored per unit length of the guide to the energy dissipated per unit length per
cycle.

Exercise

Compute the quality factors for rectangular and cylindrical guides for
specified modes $TE_{mn}$ and $TM_{mn}$.

[b] For a rectangular guide, the wavelength of propagation along the $z$ axis
for the $TE_{mn}$ or the $TM_{mn}$ modes when the frequency is more than the cutoff
frequency is given by

$$\lambda = 2\pi/\beta_{mn}, \quad \beta_{mn} = -j\gamma_{mn} = \sqrt{\omega^2\mu\epsilon - h_{mn}^2},$$

with
$$h_{mn}^2 = (m\pi/a)^2 + (n\pi/b)^2$$

The phase velocity is given by
$$v_{ph} = \nu\lambda = \omega\lambda/2\pi = \omega/\beta_{mn} = \omega/\sqrt{\omega^2\mu\epsilon - h_{mn}^2}$$


This is greater than the speed of light! The phase velocity is therefore not
a meaningful measure for the velocity of energy transfer. A more meaningful
measure is the group velocity, which is based on the following observation. Let
a wave field travelling along the $z$ axis be a sum of two harmonic components
with a small frequency difference and a small wavelength difference. Thus, it
can be expressed as

$$f(t,z) = \cos(\omega t - kz) + \cos((\omega + \Delta\omega)t - (k + \Delta k)z)$$

This can be in turn expressed using a standard trigonometric identity as

$$f(t,z) = 2\cos((\omega + \Delta\omega/2)t - (k + \Delta k/2)z)\cos((\Delta\omega\,t - \Delta k\,z)/2)$$

The second cosine term represents the slowly varying (in space and time)
envelope of the wave while the first cosine term represents the sharp variations of
the signal within the envelope. The velocity of energy transfer is measured by
that of the envelope and it is given by

$$v_g = \Delta\omega/\Delta k$$

which in the limit becomes
$$v_g = d\omega/dk$$

This is called the group velocity. In our case, we find that the group velocity is

$$v_g(m,n) = d\omega/d\beta_{mn} = (d\beta_{mn}/d\omega)^{-1} = (d\sqrt{\omega^2\mu\epsilon - h_{mn}^2}/d\omega)^{-1}$$
$$= (\mu\epsilon\omega/\beta_{mn})^{-1} = \beta_{mn}/\mu\epsilon\omega$$
$$= \sqrt{1/\mu\epsilon - h_{mn}^2/(\mu\epsilon\omega)^2} < 1/\sqrt{\mu\epsilon}$$

Thus the group velocity is smaller than the velocity of light and is therefore
a more meaningful measure of the velocity of energy transfer for the $(m,n)^{th}$
mode.
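A quick numerical illustration of the two velocities: the sketch below evaluates $v_{ph}$ and $v_g$ for the $(1,1)$ mode of an air-filled rectangular guide and confirms that their product equals $1/\mu\epsilon = c^2$. The guide dimensions, the operating frequency and the function name `velocities` are assumptions chosen for illustration.

```python
import math

C = 299792458.0  # speed of light in vacuum, m/s (air filling assumed)

def velocities(freq_hz, m, n, a, b):
    """Phase and group velocity of the (m, n) mode above cutoff."""
    omega = 2 * math.pi * freq_hz
    h_sq = (m * math.pi / a) ** 2 + (n * math.pi / b) ** 2
    beta_sq = (omega / C) ** 2 - h_sq  # omega^2 mu eps - h_mn^2
    if beta_sq <= 0:
        raise ValueError("frequency is at or below cutoff for this mode")
    beta = math.sqrt(beta_sq)
    v_phase = omega / beta
    v_group = beta * C ** 2 / omega    # beta / (mu eps omega)
    return v_phase, v_group

v_ph, v_g = velocities(20e9, 1, 1, 0.02286, 0.01016)
print(v_ph / C, v_g / C)
```

The printed ratios lie on either side of 1, with $v_{ph}v_g = c^2$, so the faster the phase fronts move the slower the energy travels.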


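Energy density in a guide of arbitrary cross section.
[a] Let $u_n(q_1, q_2)$ and $-h_n^2$ be the eigenfunctions and eigenvalues of the
Dirichlet problem

$$(\nabla_\perp^2 + h_n^2)u_n(q_1, q_2) = 0, \quad u_n(c, q_2) = 0$$

and let $v_n(q_1, q_2)$ and $-k_n^2$ be the eigenfunctions and eigenvalues of the
Neumann problem

$$(\nabla_\perp^2 + k_n^2)v_n(q_1, q_2) = 0, \quad \frac{\partial v_n(c, q_2)}{\partial q_1} = 0$$

Recall that in the orthogonal coordinate system $(q_1, q_2)$, we have

$$\nabla_\perp^2 = \frac{1}{G_1G_2}\left(\frac{\partial}{\partial q_1}\frac{G_2}{G_1}\frac{\partial}{\partial q_1} + \frac{\partial}{\partial q_2}\frac{G_1}{G_2}\frac{\partial}{\partial q_2}\right)$$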
Exercise: Assuming that the $h_n^2$ are all distinct, show that the $u_n$'s are all
orthogonal:

$$\int_D u_n(q_1, q_2)u_m(q_1, q_2)G_1(q_1, q_2)G_2(q_1, q_2)\,dq_1\,dq_2 = 0, \quad n \neq m$$

and likewise assuming that the $k_n^2$'s are all distinct, show that

$$\int_D v_n(q_1, q_2)v_m(q_1, q_2)G_1(q_1, q_2)G_2(q_1, q_2)\,dq_1\,dq_2 = 0, \quad n \neq m$$

where $D$ is the cross-section of the guide parallel to the $xy$ or equivalently $q_1q_2$
plane.
hint: Use Green’s theorem in the form

$$\int_D (u_n\nabla_\perp^2u_m - u_m\nabla_\perp^2u_n)G_1G_2\,dq_1\,dq_2 = \int_\Gamma \left(u_n\frac{\partial u_m}{\partial q_1} - u_m\frac{\partial u_n}{\partial q_1}\right)\frac{G_2}{G_1}\,dq_2$$

where $\Gamma$ is the curve bounding the cross section $D$ and is defined by the condition
$q_1 = c, z = 0$. Show that the general solution for the longitudinal component of
the electric field and the magnetic fields can be expressed as

$$E_z(\omega, q_1, q_2, z) = \sum_n c(\omega, n)u_n(q_1, q_2)\exp(-\gamma^E_n(\omega)z),$$

$$H_z(\omega, q_1, q_2, z) = \sum_n d(\omega, n)v_n(q_1, q_2)\exp(-\gamma^H_n(\omega)z)$$

where

$$\gamma^E_n(\omega) = \sqrt{h_n^2 - \omega^2\mu\epsilon},$$

$$\gamma^H_n(\omega) = \sqrt{k_n^2 - \omega^2\mu\epsilon}$$

Hence by using formulae of step 4, calculate the transverse curvilinear
components of the electromagnetic field. Now prove that the functions
$\nabla_\perp u_n(q_1, q_2), n \geq 1$ are also orthogonal, ie,

$$\int_D (\nabla_\perp u_n, \nabla_\perp u_m)G_1G_2\,dq_1\,dq_2 = 0, \quad n \neq m$$

and likewise for $\nabla_\perp v_n$. For this simply apply Green’s formula in the form

$$\int_D (u_n\nabla_\perp^2u_m + (\nabla_\perp u_n, \nabla_\perp u_m))G_1G_2\,dq_1\,dq_2 = \int_\Gamma u_n\frac{\partial u_m}{\partial q_1}\frac{G_2}{G_1}\,dq_2 = 0$$
since $u_n = 0$ on $\Gamma$. In the case of $v_n$, use the same formula but with the boundary
condition $\partial v_n/\partial q_1 = 0$ on $\Gamma$. Also show that

$$\int_D (\nabla_\perp u_n, \nabla_\perp u_n)G_1G_2\,dq_1\,dq_2 = h_n^2\int_D u_n^2G_1G_2\,dq_1\,dq_2 = h_n^2$$

$$\int_D (\nabla_\perp v_n, \nabla_\perp v_n)G_1G_2\,dq_1\,dq_2 = k_n^2\int_D v_n^2G_1G_2\,dq_1\,dq_2 = k_n^2$$

assuming that the $u_n$'s and $v_n$'s are normalized. Finally, prove the orthogonality
of $\nabla_\perp u_n \times \hat{z}, n \geq 1$ and of $\nabla_\perp v_n \times \hat{z}, n \geq 1$. Also prove the mutual orthogonality
of $\nabla_\perp u_n$ and $\nabla_\perp v_m \times \hat{z}$ and of $\nabla_\perp u_n \times \hat{z}$ and $\nabla_\perp v_m$. For this, you can use the
identities

$$((\nabla_\perp u_n \times \hat{z}), (\nabla_\perp u_m \times \hat{z})) = (\nabla_\perp u_n, \nabla_\perp u_m)$$

and likewise for $v_n$ and further, with
$$dS = G_1G_2\,dq_1\,dq_2$$
and observing that
$$\nabla_\perp f = \frac{f_{,1}}{G_1}\hat{q}_1 + \frac{f_{,2}}{G_2}\hat{q}_2,$$

that
$$\int_D (\nabla_\perp u_n, \nabla_\perp v_m \times \hat{z})\,dS = \int_D \hat{z}.(\nabla_\perp u_n \times \nabla_\perp v_m)\,dS$$

$$= \int_D (u_{n,1}v_{m,2} - u_{n,2}v_{m,1})\,dq_1\,dq_2$$

$$= \int_D ((u_nv_{m,2})_{,1} - (u_nv_{m,1})_{,2})\,dq_1\,dq_2 = 0$$

again by applying the two dimensional version of the Gauss divergence theorem
to the function $(u_nv_{m,2}, -u_nv_{m,1})$ with the boundary condition that $u_n$ vanishes
on $\Gamma$. In this way using the formulae of step 4, show that the total average energy
of the electromagnetic field in the guide can be expressed as

$$U = (\epsilon/4)\int_{D\times[0,d]} (|E_z|^2 + |E_\perp|^2)\,dS\,dz + (\mu/4)\int_{D\times[0,d]} (|H_z|^2 + |H_\perp|^2)\,dS\,dz$$

$$= \sum_n (\lambda(\omega,n)|c(\omega,n)|^2 + \mu(\omega,n)|d(\omega,n)|^2)$$

where $\lambda(\omega,n), \mu(\omega,n)$ are determined completely by $h_n^2, k_n^2, d$ where $d$ is the
length of the guide.
Exercise: Calculate the explicit formulae for $\lambda(\omega,n), \mu(\omega,n)$.


[7]

Cavity resonators.

[a] Take a rectangular waveguide of dimensions $a, b$ along the $x$ and $y$ axes
respectively and $d$ along the $z$ axis. Cover the bottom $z = 0$ and the top $z = d$
with perfectly conducting plates. We then get a rectangular cavity resonator
which is a cuboid with perfectly conducting boundaries. By applying the
waveguide equations of step 1, we get that the $\exp(-\gamma z)$ dependence may be
replaced with any linear combination of $\exp(\pm\gamma z)$ and we must choose this linear
combination so that $H_z$ vanishes when $z = 0, d$ and $E_x, E_y$ also vanish when
$z = 0, d$. It should be noted that the multiplication by $-\gamma$ that replaces $\partial/\partial z$
cannot be done here since we could multiply by either of $\pm\gamma$. This means that in the
cavity resonator case, equations (7) and (8) must be replaced by

$$E_\perp = (1/h^2)\frac{\partial}{\partial z}\nabla_\perp E_z - (j\omega\mu/h^2)\nabla_\perp H_z \times \hat{z} \quad (7')$$

$$H_\perp = (1/h^2)\frac{\partial}{\partial z}\nabla_\perp H_z + (j\omega\epsilon/h^2)\nabla_\perp E_z \times \hat{z} \quad (8')$$

It should be noted that even in the waveguide case, there are two solutions
for $\gamma$, namely $\pm\sqrt{h_{mn}^2 - \omega^2\mu\epsilon}$, and we choose that linear combination of the
corresponding exponentials so that with $E_\perp$ defined by (7’), we get that $E_\perp$
along with $H_z$ vanishes when $z = 0, d$. These conditions are equivalent to $H_z$
and $\partial E_z/\partial z$ vanishing when $z = 0, d$. Note that $E_\perp$ vanishing when $z = 0, d$ is
equivalent to $\frac{\partial}{\partial z}\nabla_\perp E_z = \nabla_\perp\frac{\partial E_z}{\partial z}$ vanishing when $z = 0, d$. Thus we must choose
$\gamma = j\pi p/d$ for some integer $p$ and the combination $\sin(\pi pz/d) = (\exp(\gamma z) -
\exp(-\gamma z))/2j$ for $H_z$ and $\cos(\pi pz/d) = (\exp(\gamma z) + \exp(-\gamma z))/2$ for $E_z$. A little
reflection will show that this is valid for cavity resonators of arbitrary cross
section in the $xy$ plane.


[8]

Exercises

[1] This problem tells us how to analyze the waveguide fields in the presence
of a gravitational field which is independent of $(t, z)$ and described in general
relativity in terms of an appropriate metric tensor.

Assume that the metric of space-time is diagonal with the coefficients
independent of $t, z$. Thus, the metric has the form

$$d\tau^2 = g_{00}(x,y)dt^2 + g_{11}(x,y)dx^2 + g_{22}(x,y)dy^2 + g_{33}(x,y)dz^2$$

Write down explicitly the components of the Maxwell equations

$$F_{\mu\nu,\sigma} + F_{\nu\sigma,\mu} + F_{\sigma\mu,\nu} = 0 \quad (1)$$

$$(F^{\mu\nu}\sqrt{-g})_{,\nu} = 0 \quad (2)$$

in this background metric assuming the dependence on $(t, z)$ to be of the form
$$F_{\mu\nu}(t,x,y,z) = H_{\mu\nu}(x,y)\exp(j\omega t - \gamma(\omega)z)$$

Identifying equation (1) with the homogeneous Maxwell equations
$$\mathrm{curl}\,E + j\omega B = 0, \quad \mathrm{div}\,B = 0$$

identify the vectors $E, B$ in terms of the components of $H_{\mu\nu}$. Now write down
the other Maxwell equations in free space (2) in terms of the $H_{\mu\nu}$ and hence
in terms of $E, B$, solve for $E_x, E_y, B_x, B_y$ in terms of $E_z, B_z$ and derive the
generalized two dimensional Helmholtz equations satisfied by $E_z, B_z$. In case
inhomogeneous permittivity $\epsilon(\omega,x,y)$ and permeability $\mu(\omega,x,y)$ are also to be
taken into account in equation (2), first identify (2) with the Maxwell equations

$$\mathrm{div}\,D = 0, \quad \mathrm{curl}\,H - j\omega D = 0$$

and hence determine $D, H$ in terms of the components of $F^{\mu\nu}$. Thus, state how
the vacuum medium relation

$$F^{\mu\nu} = g^{\mu\alpha}g^{\nu\beta}F_{\alpha\beta}$$

gets modified in the presence of the inhomogeneous medium. Derive therefrom
the relationship between $D, H$ and $E, B$ in the inhomogeneous medium in the
presence of this non-flat diagonal metric. Obtain thus the modified generalized
Helmholtz equations for $E_z, H_z$ in this inhomogeneous medium in the presence
of the above gravitational field. Generalize this theory to the case of orthogonal
curvilinear coordinates $q = (q_1, q_2)$ in the $x-y$ plane, ie, by writing down the
metric as

$$d\tau^2 = g_{00}(q)dt^2 + g_{11}(q)dq_1^2 + g_{22}(q)dq_2^2 + g_{12}(q)dq_1dq_2 + g_{33}(q)dz^2$$


[2] Show that the most general solution for the electromagnetic field within
a cavity resonator of arbitrary cross section in the $xy$ plane with length $d$ along
the $z$ axis is given by

$$E_z(t, q_1, q_2, z) = \sum_{n,p\geq 1} u_n(q_1, q_2).(2/d)^{1/2}.\cos(\pi pz/d).\mathrm{Re}(c(n,p)\exp(j\omega^E(n,p)t))$$

$$H_z(t, q_1, q_2, z) = \sum_{n,p\geq 1} v_n(q_1, q_2).(2/d)^{1/2}.\sin(\pi pz/d).\mathrm{Re}(d(n,p)\exp(j\omega^H(n,p)t))$$

$$E_\perp(t, q_1, q_2, z) = \sum_{n,p} h_n^{-2}(-\pi p/d).\nabla_\perp u_n(q_1, q_2).(2/d)^{1/2}.\sin(p\pi z/d).\mathrm{Re}(c(n,p).\exp(j\omega^E(n,p)t))$$

$$- \sum_{n,p} k_n^{-2}.(\nabla_\perp v_n(q_1, q_2) \times \hat{z}).(2/d)^{1/2}.\sin(p\pi z/d).\mathrm{Re}(j\mu\omega^H(n,p)d(n,p).\exp(j\omega^H(n,p)t))$$

$$H_\perp(t, q_1, q_2, z) = \sum_{n,p} k_n^{-2}(\pi p/d).\nabla_\perp v_n(q_1, q_2).(2/d)^{1/2}.\cos(p\pi z/d).\mathrm{Re}(d(n,p).\exp(j\omega^H(n,p)t))$$

$$+ \sum_{n,p} h_n^{-2}.(\nabla_\perp u_n(q_1, q_2) \times \hat{z}).(2/d)^{1/2}.\cos(p\pi z/d).\mathrm{Re}(j\epsilon\omega^E(n,p)c(n,p).\exp(j\omega^E(n,p)t))$$

where the notation of step 6 has been used. Note that the characteristic E-field
and H-field frequencies of oscillation are respectively given by

$$\omega^E(n,p) = (\mu\epsilon)^{-1/2}\sqrt{h_n^2 + (p\pi/d)^2},$$

$$\omega^H(n,p) = (\mu\epsilon)^{-1/2}\sqrt{k_n^2 + (p\pi/d)^2},$$

To see how these expressions arise, simply use the waveguide formula

$$\omega^2\mu\epsilon + \gamma^E_n(\omega)^2 = h_n^2$$

for the transverse magnetic field situations and

$$\omega^2\mu\epsilon + \gamma^H_n(\omega)^2 = k_n^2$$

for transverse electric field situations, combined with the resonator formula
(obtained by applying the boundary conditions on the $z = 0, d$ surfaces)

$$\gamma^E_n(\omega)^2 = -(\pi p/d)^2 = \gamma^H_n(\omega)^2$$

You must apply the formula

$$E_\perp = (1/h^2)\frac{\partial}{\partial z}\nabla_\perp E_z,$$

$$H_\perp = (\epsilon/h^2)\frac{\partial}{\partial t}\nabla_\perp E_z \times \hat{z}$$

for transverse magnetic fields, and

$$E_\perp = -(\mu/k^2)\frac{\partial}{\partial t}\nabla_\perp H_z \times \hat{z},$$

$$H_\perp = (1/k^2)\frac{\partial}{\partial z}\nabla_\perp H_z$$

for transverse electric fields and then apply the superposition principle, namely
that the total electric field is the superposition over transverse magnetic modes,
ie, with $H_z = 0$ and the transverse electric modes, ie, with $E_z = 0$. These
are derived from the formulas of step 1 by replacing $-\gamma$ by $\partial/\partial z$ and $j\omega$ by
$\partial/\partial t$. This is required since the general form of the cavity fields consists not of
one mode and one frequency but is rather a superposition over all the modes
and cavity frequencies. In other words, for cavity fields we must rewrite the
waveguide formulas in the space-time domain $(x, y, z, t)$, rather than in the 2-D
space and frequency domain $(x, y, \omega)$.
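The resonance formula $\omega(n,p) = (\mu\epsilon)^{-1/2}\sqrt{h_n^2 + (p\pi/d)^2}$ can be illustrated for the special case of a rectangular cross section, where the transverse eigenvalue is $(m\pi/a)^2 + (n\pi/b)^2$. The following is a minimal sketch assuming a vacuum-filled cavity; the dimensions and the function name `rectangular_cavity_frequency` are hypothetical.

```python
import math

C = 299792458.0  # 1/sqrt(mu*eps) for an assumed vacuum-filled cavity, m/s

def rectangular_cavity_frequency(m, n, p, a, b, d):
    """Resonant frequency in Hz from the transverse eigenvalue and the z index p."""
    h_sq = (m * math.pi / a) ** 2 + (n * math.pi / b) ** 2
    omega = C * math.sqrt(h_sq + (p * math.pi / d) ** 2)
    return omega / (2 * math.pi)

a, b, d = 0.03, 0.02, 0.04  # assumed cavity dimensions in metres
f_111 = rectangular_cavity_frequency(1, 1, 1, a, b, d)
f_112 = rectangular_cavity_frequency(1, 1, 2, a, b, d)
print(f_111, f_112)
```

Unlike the waveguide, where $\omega$ is a free parameter and $\gamma$ follows from it, the end plates fix $\gamma^2 = -(\pi p/d)^2$, so only a discrete set of frequencies survives.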
[3] Determine an expression for the total energy per cycle dissipated in a
cavity resonator of arbitrary cross section for a $TM_{np}$ mode and for a $TE_{np}$
mode. By a $TM_{np}$ mode, we mean the electric and magnetic fields derived from

$$H_z = 0, \quad E_z = u_n(q_1, q_2)(2/d)^{1/2}\cos(\pi pz/d).\mathrm{Re}(c(n,p)\exp(j\omega^E(n,p)t))$$

and by a $TE_{np}$ mode, we mean the electric and magnetic fields derived from

$$E_z = 0, \quad H_z = v_n(q_1, q_2)(2/d)^{1/2}.\sin(\pi pz/d).\mathrm{Re}(d(n,p).\exp(j\omega^H(n,p)t))$$


Chapter 2 


Remarks and Problems on 
Statistical Signal Processing 


[1] Construct the lattice filter order recursion for an $\mathbb{R}^M$-valued vector
stationary stochastic process $X(t), t \in \mathbb{Z}$ by minimizing

$$\mathbb{E}\left[\left\|X(t) + \sum_{k=1}^p A(k)X(t-k)\right\|^2\right]$$

with respect to the $M \times M$ prediction coefficient matrices $A(k), k = 1, 2, \ldots, p$.

hints: Setting the variational derivatives of the above energy w.r.t. the $A(m)$'s
to zero gives us the optimal normal equations in the form of block structured
matrix equations

$$R(m) + \sum_{k=1}^p A_p(k)R(m-k) = 0, \quad m = 1, 2, \ldots, p$$
where
$$R(m) = \mathbb{E}(X(t)X(t-m)^T) \in \mathbb{R}^{M\times M}$$
Note that 


Note also that the optimal prediction error covariance is given by
$$R_e(p) = \mathbb{E}(e_p(t)e_p(t)^T)$$

where
$$e_p(t) = X(t) + \sum_{k=1}^p A_p(k)X(t-k)$$

and by virtue of the orthogonality equations or equivalently, the optimal normal
equations,
$$R_e(p) = R(0) + \sum_{k=1}^p A_p(k)R(-k)$$

Note that the minimum prediction error energy of $p^{th}$ order is
$$E(p) = \mathrm{Tr}(R_e(p))$$
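To make the order recursion concrete, the sketch below treats only the scalar special case $M = 1$, where the normal equations are solved by the classical Levinson-Durbin recursion and the reflection coefficients are the lattice parameters. It is an illustration under that simplifying assumption, not the multichannel construction asked for in the problem; the function name and the AR(1) test autocorrelation are hypothetical choices.

```python
def levinson_durbin(r, order):
    """Order recursion for a scalar process; r[m] is the autocorrelation at lag m.

    Returns the coefficients a(1..order) of the prediction error
    x(t) + sum_k a(k) x(t-k), the reflection coefficients, and the
    final prediction error energy.
    """
    a = []
    reflection = []
    energy = r[0]
    for m in range(1, order + 1):
        # r(m) + sum_{i=1}^{m-1} a(i) r(m-i), with a stored 0-indexed
        acc = r[m] + sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k_m = -acc / energy
        # a_new(i) = a(i) + k_m a(m-i) for i < m, and a_new(m) = k_m
        a = [a[i] + k_m * a[m - 2 - i] for i in range(m - 1)] + [k_m]
        reflection.append(k_m)
        energy *= (1.0 - k_m * k_m)
    return a, reflection, energy

# Assumed test case: AR(1) process x(t) = rho x(t-1) + w(t) with unit noise variance.
rho = 0.5
r = [rho ** m / (1.0 - rho ** 2) for m in range(3)]
coeffs, refl, err = levinson_durbin(r, 2)
print(coeffs, refl, err)
```

For this AR(1) example the second order predictor reduces to the first order one: the second reflection coefficient vanishes and the error energy equals the driving noise variance.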


Now write down the optimal equations in block matrix structured form and
apply the block time reversal operator $J_p$, consisting of a reverse diagonal having
blocks $I_M$, with all the other blocks being zero matrices. Note that

$$J_p R_p J_p = S_p$$

where

$$R_p = ((R(k - m)))_{1 \le k,m \le p}$$

and

$$S_p = ((R(m - k)))_{1 \le k,m \le p}$$

To get at an order recursion, consider a dual normal equation with $A(k)$ replaced
by a second set of coefficient matrices $\tilde{A}(k)$ and $R(k)$ by $S(k)$.
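The block normal equations of problem [1] can be checked numerically. The following is a minimal NumPy sketch, not taken from the text: the function names are hypothetical, and the test process is an assumed first order vector autoregression $X(t) = F X(t-1) + W(t)$ with unit noise covariance, for which $R(m) = F^m R(0)$ for $m \ge 0$. For such a process the optimal predictor should come out as $A(1) = -F$, $A(2) = 0$, with error covariance equal to the noise covariance.

```python
import numpy as np

def var1_autocorrelation(F, max_lag, n_terms=200):
    """R(m) = E[X(t) X(t-m)^T] for the assumed model X(t) = F X(t-1) + W(t), Cov(W) = I."""
    M = F.shape[0]
    P = np.zeros((M, M))              # P = R(0) = sum_j F^j (F^j)^T (truncated series)
    Fj = np.eye(M)
    for _ in range(n_terms):
        P += Fj @ Fj.T
        Fj = F @ Fj
    R = {}
    for m in range(max_lag + 1):
        R[m] = np.linalg.matrix_power(F, m) @ P
        R[-m] = R[m].T                # R(-m) = R(m)^T
    return R

def solve_normal_equations(R, p):
    """Solve R(m) + sum_k A(k) R(m-k) = 0, m = 1..p, for A(1), ..., A(p)."""
    M = R[0].shape[0]
    # Block (k, m) of T is R(m - k), so that [A(1) ... A(p)] T = -[R(1) ... R(p)].
    T = np.block([[R[m - k] for m in range(1, p + 1)] for k in range(1, p + 1)])
    rhs = -np.hstack([R[m] for m in range(1, p + 1)])
    A_row = np.linalg.solve(T.T, rhs.T).T
    return [A_row[:, k * M:(k + 1) * M] for k in range(p)]

def error_covariance(R, A):
    """R_e(p) = R(0) + sum_k A(k) R(-k)."""
    return R[0] + sum(A[k - 1] @ R[-k] for k in range(1, len(A) + 1))

F = np.array([[0.5, 0.1], [0.0, 0.3]])   # illustrative stable matrix
R = var1_autocorrelation(F, max_lag=2)
A = solve_normal_equations(R, p=2)
Re = error_covariance(R, A)
```

This solves the full block system directly; the order recursion asked for in the problem would obtain the same coefficients without inverting the whole block Toeplitz matrix at each order.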


[2] Consider the RLS lattice algorithm for multivariate prediction in both
order and time. How would you proceed?
hint: Define the data vector at time $N$ as

$$X_N = \begin{pmatrix} X(N)^T \\ X(N-1)^T \\ \vdots \\ X(0)^T \end{pmatrix} \in \mathbb{R}^{(N+1) \times M}$$

and define the data matrix at time $N$ of order $p$ by

$$X_{N,p} = [z^{-1}X_N, z^{-2}X_N, \ldots, z^{-p}X_N] \in \mathbb{R}^{(N+1) \times Mp}$$

The optimal matrix predictor at time $N$ of order $p$ is then given by minimizing

$$\| X_N + X_{N,p} A_{N,p} \|^2$$

where $\| . \|$ denotes the Frobenius norm and

$$A_{N,p} = \begin{pmatrix} A_p(1)^T \\ A_p(2)^T \\ \vdots \\ A_p(p)^T \end{pmatrix} \in \mathbb{R}^{Mp \times M}$$

Question: Identify the appropriate Hilbert space for which this problem can
be formulated as an orthogonal projection problem.
[3] Calculate the autocorrelation function of the electromagnetic field inside
a waveguide of arbitrary cross section assuming that at the feedpoint, namely
at the mouth of the guide at the $z = 0$ plane, the correlations of the $E_z, H_z$ fields
are known.

[4] How would you develop an EKF for estimating the electromagnetic field
in space-time over a bounded region from noisy measurements of the same at
a finite discrete set of spatial pixels when the driving current density field is a
white Gaussian noise field in space-time?

hint: Write down the wave equation for $A$ in the form

$$(\nabla^2 - (1/c^2)\partial_t^2) A(t, r) = -\mu J(t, r)$$

Transform it into two first order in time pde's and by spatial pixel discretization,
cast it in state variable form. Now, use the fact that the electric and magnetic
fields can be expressed in a source free region as

$$E(t, r) = c^2 \int_0^t \nabla \times (\nabla \times A)\,dt$$

$$B(t, r) = \nabla \times A$$

to arrive at a measurement model for $\partial E/\partial t, B$ at a discrete set of spatial pixels.
Apply the EKF to this.


[5] Consider the problem of estimating the moments of a vector parameter
that modulates a set of potentials for a quantum system. The Hamiltonian is
thus of the form

$$H(t, \theta) = H_0 + \sum_{k=1}^{p} \theta(k) V_k(t)$$

The objective is to estimate the moments of the parameters

$$\mu_p(k_1, \ldots, k_p) = \langle \theta(k_1) \ldots \theta(k_p) \rangle$$

Schrodinger's equation for the wave function is

$$i\psi'(t) = H(t, \theta)\psi(t)$$

and it has a Dyson series solution

$$\psi(t) = U_0(t)\psi(0) + \sum_{n=1}^{\infty} (-i)^n \int_{0<t_n<\ldots<t_1<t} U_0(t-t_1)V(t_1, \theta)U_0(t_1-t_2)V(t_2, \theta) \ldots U_0(t_{n-1}-t_n)V(t_n, \theta)U_0(t_n)\psi(0)\,dt_1 \ldots dt_n$$

where

$$U_0(t) = \exp(-itH_0), \quad V(t, \theta) = \sum_{k=1}^{p} \theta(k)V_k(t)$$
The quantum plus classical average of an observable $X$ after time $t$ is given by

$$\langle X \rangle(t) = \langle\langle \psi(t)|X|\psi(t) \rangle\rangle$$

which can be expressed in the form

$$\langle X \rangle(t) = \langle X \rangle_0(t) + \sum_{n \ge 1} \sum_{k_1, \ldots, k_n} \mu_n(k_1, \ldots, k_n) F(t, k_1, \ldots, k_n) \quad (1)$$

where

$$\langle X \rangle_0(t) = \langle U_0(t)\psi(0)|X|U_0(t)\psi(0) \rangle = \langle \psi(0)|U_0(t)^* X U_0(t)|\psi(0) \rangle$$

Exercise: Derive an explicit formula for $F(t, k_1, \ldots, k_n)$ in terms of $|\psi(0)\rangle$,
$V_k(t), k = 1, \ldots, p$ and $U_0(t)$.

Answer: Let $\psi_m(t)$ denote the $m^{th}$ order term of the Dyson series,

$$\psi_m(t) = (-i)^m \int_{0<t_m<\ldots<t_1<t} U_0(t-t_1)V(t_1, \theta)U_0(t_1-t_2)V(t_2, \theta) \ldots U_0(t_{m-1}-t_m)V(t_m, \theta)U_0(t_m)\psi(0)\,dt_1 \ldots dt_m$$

Then $F(t, k_1, \ldots, k_n)$ is the coefficient of $\theta(k_1) \ldots \theta(k_n)$ in the sum of the terms

$$\langle \psi_m(t)|X|\psi_r(t) \rangle, \quad m, r \ge 1, \; m + r = n$$

and the term

$$2\,\mathrm{Re}\,\langle \psi_n(t)|X|U_0(t)\psi(0) \rangle$$

Now derive an RLS lattice algorithm for estimating the parameter moments
$\mu_p(k_1, \ldots, k_p)$ recursively in order and in time from the model (1).


[6] Consider the $p^{th}$ order Volterra system

$$y(t) = \sum_{k=1}^{p} \sum_{t_1, \ldots, t_k = 0}^{M} h_k(t_1, \ldots, t_k)x(t - t_1) \ldots x(t - t_k) + e(t)$$

By the use of the Kronecker tensor product, cast this equation in the form of a
linear model of the type

$$y_t = \sum_{k=1}^{p} D_x(t, k, M)h_{M,k} + e_t = D_x(t, M)g_M + e_t$$

where $D_x(t, k, M)$ is a data matrix built out of the input variables
$x(s - t_1) \ldots x(s - t_k), s \le t, t_1, \ldots, t_k = 0, 1, \ldots, M$ and

$$D_x(t, M) = [D_x(t, 1, M), \ldots, D_x(t, p, M)],$$

$$g_M = (h_{M,1}^T, \ldots, h_{M,p}^T)^T$$

Derive an RLS lattice algorithm for estimating $g_M$ recursively in time $t$ and
order $M$ for a fixed $p$ ($p$ is the degree of the Volterra system and $M$ is the order).
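The Kronecker product construction of problem [6] can be sketched as follows. This is only an illustrative batch least squares version under stated assumptions, not the RLS lattice algorithm asked for: the helper name `volterra_regressor`, the kernels `h1`, `h2` and the Gaussian test input are all made up for the example. Because the products $x(t-t_1)x(t-t_2)$ are symmetric in $(t_1, t_2)$, the regressor has repeated columns and only the symmetrized kernel is identifiable; the minimum norm solution returned by `lstsq` is exactly that symmetrized kernel.

```python
import numpy as np

def volterra_regressor(x, t, M, p):
    """Row [w, w (x) w, ..., w^{(x) p}] with w = (x[t], x[t-1], ..., x[t-M])."""
    w = np.array([x[t - s] for s in range(M + 1)])
    blocks, kron = [], np.ones(1)
    for _ in range(p):
        kron = np.kron(kron, w)       # k-fold Kronecker power of the lag vector
        blocks.append(kron)
    return np.concatenate(blocks)

rng = np.random.default_rng(0)
M, p, T = 1, 2, 200
x = rng.standard_normal(T)
h1 = np.array([1.0, -0.5])                    # hypothetical first order kernel
h2 = np.array([[0.3, 0.2], [0.0, -0.1]])      # hypothetical second order kernel
g_true = np.concatenate([h1, h2.ravel()])

# Stack one regressor row per time instant and form noiseless outputs.
D = np.array([volterra_regressor(x, t, M, p) for t in range(M, T)])
y = D @ g_true

g_hat, *_ = np.linalg.lstsq(D, y, rcond=None)  # minimum norm least squares solution
h1_hat = g_hat[:2]
h2_hat = g_hat[2:].reshape(2, 2)
```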


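[7] Determine the wave operators for the Hamiltonian pair

$$H_0 = P = -i\,d/dx, \quad H_1 = P + V(x)$$

acting in $L^2(\mathbb{R})$.

hint: Note that

$$H_1 = \exp\Big(-i\int_0^x V(\xi)d\xi\Big)\,P\,\exp\Big(i\int_0^x V(\xi)d\xi\Big)$$

and hence

$$\exp(itH_1) = \exp\Big(-i\int_0^x V(\xi)d\xi\Big)\exp(itP)\exp\Big(i\int_0^x V(\xi)d\xi\Big)$$

and also make use of Taylor's formula

$$\exp(t\,d/dx) f(x) = f(x + t)$$

Then, evaluate the wave operator $\Omega_+$ defined as

$$\Omega_+ f(x) = \lim_{t \to \infty} \exp(itH_1)\exp(-itH_0) f(x)$$

Identify the domain of $\Omega_+$, i.e., the set of all functions $f$ for which the above
limit exists in $L^2(\mathbb{R})$.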

[7] Let $H_0, H_1$ be two Hamiltonians with spectral measures $E_0(.)$ and $E_1(.)$
respectively. Show that the wave operator acting on a vector $|f\rangle$ is given by

$$\Omega_+|f\rangle = \lim_{t \to \infty} \exp(itH_1)\exp(-itH_0)|f\rangle$$

$$= \lim_{t \to \infty} \int \exp(it(y - x))\,dE_1(y)\,dE_0(x)|f\rangle$$

Now assume that both $H_0, H_1$ have purely continuous spectra. Then, show that

$$\langle g|\Omega_+|f\rangle = \lim_{t \to \infty} \int_{\mathbb{R}^2} \exp(it(y - x)) \langle g|E_1'(y)E_0'(x)|f\rangle\,dx\,dy$$

$$= \lim_{t \to \infty} \int_{\mathbb{R}^2} \exp(it(y - x)) \frac{\partial^2 F_{g,f}(x, y)}{\partial y\,\partial x}\,dx\,dy$$

where

$$F_{g,f}(x, y) = \langle g|E_1(y)E_0(x)|f\rangle$$

Now suppose that

$$H_{g,f}(z) = \int \langle g|E_1'(z + x)E_0'(x)|f\rangle\,dx = \int \frac{\partial^2 F_{g,f}}{\partial y\,\partial x}(x, z + x)\,dx$$

exists and is square integrable in $z$. Then,

$$\langle g|\Omega_+|f\rangle = 0$$
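Suppose now that

$$H_1 = H_0 + \epsilon V$$

where $V$ is a random potential and $\epsilon$ is a small perturbation parameter. Show
that

$$\exp(itH_1) = \exp(itH_0) + \epsilon\,it\,\exp(itH_0)\big((1 - \exp(-it\,ad(H_0)))/(it\,ad(H_0))\big)(V) + O(\epsilon^2)$$

$$= \exp(itH_0) + \epsilon\,it\,\exp(itH_0)\,g(it\,ad(H_0))(V) + O(\epsilon^2)$$

where

$$g(z) = (1 - \exp(-z))/z$$

(the factor $it$ arises because the perturbation of the exponent $it(H_0 + \epsilon V)$ is $\epsilon\,itV$).
Hence, deduce that

$$\Omega(t) = \exp(itH_1)\exp(-itH_0) = I + \epsilon\,it\,\exp(it\,ad(H_0))\,g(it\,ad(H_0))(V) + O(\epsilon^2)$$

Hence assuming a given covariance function for the potential $V$, i.e.,

$$R_{VV} = E(V \otimes V)$$

compute

$$E((\Omega(t) - I) \otimes (\Omega(s) - I))$$

up to $O(\epsilon^2)$ in terms of $R_{VV}$. Now go one step further in perturbation theory
as follows: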


$$\exp(tH_1) = \exp(t(H_0 + \epsilon V)) = \exp(tH_0)W(t, \epsilon)$$

say. Then,

$$\partial_t W(t, \epsilon) = \epsilon\,\exp(-tH_0)V\exp(tH_0)W(t, \epsilon) = \epsilon\,\exp(-t\,ad(H_0))(V)\,W(t, \epsilon)$$

Thus,

$$W(1, \epsilon) = I + \epsilon \int_0^1 \exp(-t\,ad(H_0))(V)\,dt + \epsilon^2 \int_{0<s<t<1} \exp(-t\,ad(H_0))(V)\exp(-s\,ad(H_0))(V)\,dt\,ds + O(\epsilon^3)$$

Show that

$$\int_0^1 \exp(-t\,ad(H_0))\,dt = (1 - \exp(-ad(H_0)))/ad(H_0),$$

$$\int_{0<s<t<1} \exp(-t\,ad(H_0))\exp(-s\,ad(H_0))\,ds\,dt = \int_0^1 \exp(-t\,ad(H_0))(1 - \exp(-t\,ad(H_0)))\,dt/ad(H_0)$$

$$= (1 - \exp(-ad(H_0)))/ad(H_0)^2 - (1 - \exp(-2\,ad(H_0)))/(2\,ad(H_0)^2)$$

$$= (g(ad(H_0)) - g(2\,ad(H_0)))/ad(H_0)$$

and hence obtain a formula for

$$E((\Omega(t) - I) \otimes (\Omega(s) - I))$$

up to $O(\epsilon^2)$.


[8] Let $X(t), t \in \mathbb{Z}$ be an $M$-variate zero mean stationary Gaussian stochastic
process with autocorrelation

$$R[k] = E(X(t + k)X(t)^T) \in \mathbb{R}^{M \times M}, \quad k \in \mathbb{Z}$$

Prove that $R[.]$ is positive semidefinite in the sense that if $z_1, \ldots, z_p \in \mathbb{R}^M$ are
arbitrary, then

$$\sum_{k,m=1}^{p} z_k^T R[k - m] z_m \ge 0$$

Prove that this is zero iff

$$\sum_{k=1}^{p} z_k^T X[k] = 0$$

i.e., the samples of the process are linearly dependent. Now consider the spectral
density matrix of the process defined by

$$S(\omega) = \sum_{k \in \mathbb{Z}} R[k]\exp(-j\omega k) \in \mathbb{C}^{M \times M}, \quad \omega \in \mathbb{R}$$
Define the periodogram spectral density estimate

$$\hat{S}_N(\omega) = \frac{1}{N}\Big[\sum_{t=1}^{N} X(t)\exp(-j\omega t)\Big]\Big[\sum_{t=1}^{N} X(t)^T \exp(j\omega t)\Big]$$

Then prove that

$$E[\hat{S}_N(\omega)] = \frac{1}{N}\sum_{t,s=1}^{N} R[t - s]\exp(-j\omega(t - s)) = \sum_{k=-(N-1)}^{N-1} (1 - |k|/N)R[k]\exp(-j\omega k)$$

Prove that if

$$\sum_{k \in \mathbb{Z}} \| R[k] \| < \infty,$$

for any matrix norm $\| . \|$ (note that any two matrix norms on the space
of $M \times M$ matrices are equivalent for $M$ finite, i.e., they generate the same topology,
which means that if the above series converges for any one matrix norm, it then
converges for all the matrix norms), then

$$\lim_{N \to \infty} E[\hat{S}_N(\omega)] = S(\omega)$$
Now using the formula for stationary zero mean vector valued Gaussian processes

$$E(X_k(t_1)X_l(t_2)X_m(t_3)X_n(t_4)) = R_{kl}[t_1 - t_2]R_{mn}[t_3 - t_4] + R_{km}[t_1 - t_3]R_{ln}[t_2 - t_4] + R_{kn}[t_1 - t_4]R_{lm}[t_2 - t_3]$$

obtain a formula for

$$Cov((\hat{S}_N(\omega_1))_{kl}, (\hat{S}_N(\omega_2))_{mn}) = E[(\hat{S}_N(\omega_1))_{kl}(\hat{S}_N(\omega_2))_{mn}] - E[(\hat{S}_N(\omega_1))_{kl}]\,E[(\hat{S}_N(\omega_2))_{mn}]$$

and show that this does not converge to zero as $N \to \infty$. What does it converge
to and what can you infer from this result?
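The expected periodogram identity of problem [8] is easy to confirm numerically. The sketch below is a minimal check under an assumed autocorrelation $R[k] = \rho^{|k|} C$ with a fixed symmetric matrix $C$ (a choice made purely for illustration); the function names are hypothetical. It compares the double sum with the triangular (Bartlett) window form, and with the limiting spectral density $S(\omega) = C(1 - \rho^2)/(1 - 2\rho\cos\omega + \rho^2)$ that this particular $R$ gives.

```python
import numpy as np

RHO = 0.5
C = np.array([[1.0, 0.3], [0.3, 2.0]])   # illustrative symmetric matrix

def R(k):
    """Assumed autocorrelation R[k] = rho^|k| C, summable in k."""
    return RHO ** abs(k) * C

def expected_periodogram_double_sum(N, w):
    """(1/N) sum_{t,s=1}^N R[t-s] exp(-j w (t-s))."""
    total = np.zeros_like(C, dtype=complex)
    for t in range(1, N + 1):
        for s in range(1, N + 1):
            total += R(t - s) * np.exp(-1j * w * (t - s))
    return total / N

def expected_periodogram_bartlett(N, w):
    """sum_{|k|<N} (1 - |k|/N) R[k] exp(-j w k)."""
    return sum((1 - abs(k) / N) * R(k) * np.exp(-1j * w * k)
               for k in range(-(N - 1), N))

def spectral_density(w):
    """S(w) = sum_k R[k] exp(-j w k) in closed form for the assumed R."""
    return C * (1 - RHO ** 2) / (1 - 2 * RHO * np.cos(w) + RHO ** 2)

w = 0.7
lhs = expected_periodogram_double_sum(8, w)
rhs = expected_periodogram_bartlett(8, w)
large_N = expected_periodogram_bartlett(2000, w)
```

The check only concerns the mean of the periodogram; the non-vanishing covariance asked about at the end of the problem needs the fourth moment formula and is not simulated here.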


[9] This problem outlines the steps for proving the singular value decomposition
of an $M \times N$ complex matrix $A$.

step 1: Show that $P = A^*A$ is an $N \times N$ positive semidefinite matrix and
hence it can be diagonalized by the spectral theorem as

$$P = UDU^*$$

where $U$ is an $N \times N$ unitary matrix and

$$D = diag[\sigma_1^2, \ldots, \sigma_r^2, 0, \ldots, 0], \quad \sigma_1, \ldots, \sigma_r > 0$$

is a diagonal $N \times N$ matrix with exactly $r$ positive entries and all the other
entries zero, where $r = rank(A) = rank(P) = rank(Q)$ with $Q = AA^*$.

step 2: Let

$$U = [u_1, \ldots, u_N]$$

Then show that

$$v_k = Au_k/\sigma_k, \quad k = 1, 2, \ldots, r$$

form an orthonormal set of $r$ vectors in $\mathbb{C}^M$. Note that $u_1, \ldots, u_N$ are orthonormal
in $\mathbb{C}^N$. Now extend the orthonormal set $\{v_1, \ldots, v_r\}$ to an orthonormal basis
$\{v_1, \ldots, v_M\}$ of $\mathbb{C}^M$. Show that $\{v_1, \ldots, v_r\}$ forms an orthonormal basis for $R(A)$
while $\{v_{r+1}, \ldots, v_M\}$ forms an orthonormal basis for $R(A)^\perp = N(A^*)$. In fact,
show that

$$A^*v_k = \sigma_k u_k, \quad k = 1, 2, \ldots, r,$$

$$A^*v_k = 0, \quad k = r + 1, \ldots, M$$
step 3: Now define the $M \times M$ unitary matrix

$$V = [v_1, \ldots, v_M]$$

Then show that the discussion in step 2 can be expressed as

$$A^*V = U\Sigma$$

where $\Sigma$ is an $N \times M$ matrix of the form

$$\Sigma = \begin{pmatrix} \Sigma_r & 0 \\ 0 & 0 \end{pmatrix}$$

where

$$\Sigma_r = diag[\sigma_1, \ldots, \sigma_r]$$

Conclude that

$$A^* = U\Sigma V^*$$

and hence

$$A = V\Sigma^T U^*$$

[10] Using the singular value decomposition for rectangular matrices described
in problem [9], obtain the general solution to the least squares problem
of computing a vector $\theta$ such that $\| x - A\theta \|^2$ is a minimum. Also determine
that least squares solution $\hat{\theta}$ having minimum norm and prove that it is given
by

$$\hat{\theta} = pinv(A)x$$

where

$$pinv(A) = U\Gamma V^*$$

where $\Gamma$ is the $N \times M$ matrix defined by

$$\Gamma = \begin{pmatrix} \Sigma_r^{-1} & 0 \\ 0 & 0 \end{pmatrix}$$
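A short numerical illustration of problem [10], again as a sketch under assumptions: the helper `pinv_via_svd` is a hypothetical name, and it relies on `numpy.linalg.svd`, whose factors `W, s, Uh` with $A = W\,diag(s)\,Uh$ correspond to $V = W$ and $U = Uh^*$ in the notation of problem [9]. The test matrix is deliberately rank deficient so that the minimum norm property is visible.

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """pinv(A) = U Gamma V^*, with Gamma the N x M matrix carrying 1/sigma_k."""
    M, N = A.shape
    W, s, Uh = np.linalg.svd(A, full_matrices=True)   # A = W diag(s) Uh
    r = int(np.sum(s > tol * s[0]))                   # numerical rank
    Gamma = np.zeros((N, M))
    Gamma[:r, :r] = np.diag(1.0 / s[:r])
    return Uh.conj().T @ Gamma @ W.conj().T

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 2))
Cm = rng.standard_normal((2, 3))
A = B @ Cm                                   # 5 x 3 real matrix of rank 2
x = rng.standard_normal(5)

theta = pinv_via_svd(A) @ x                  # minimum norm least squares solution
null_vec = np.cross(Cm[0], Cm[1])            # spans N(A), since A = B Cm
residual_gradient = A.T @ (x - A @ theta)    # vanishes at any least squares solution
```

Any other least squares solution has the form `theta + c * null_vec`; since `theta` is orthogonal to `null_vec`, its norm is the smallest.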


[11] In this problem, we outline a step-wise procedure for statistical image
processing on a curved surface on which a Lie group of transformations acts. The
ideas are based on group representation theory.

step 1: Let $M$ be a set on which a Lie group $G$ of transformations acts
transitively. By a group action, we mean a map

$$\tau : G \times M \to M$$

where by writing

$$\tau(g, x) = g.x, \quad g \in G, x \in M$$

we have

$$(g_2 g_1).x = g_2.(g_1.x), \quad g_1, g_2 \in G, x \in M$$

By a transitive group action, we mean that given any two $x, y \in M$, we can find
a $g \in G$ so that $y = g.x$.


step 2: Let $dg$ be a left invariant Haar measure for the group $G$, i.e., $d(h.g) = dg, h \in G$
or more precisely,

$$\int_G f(hg)\,dg = \int_G f(g)\,dg, \quad h \in G$$

Choose and fix an $x_0 \in M$ and define a map

$$\tau : G \to M$$

by

$$\tau(g) = g.x_0, \quad g \in G$$

Note that by transitivity of the group action, $\tau(g)$ covers the whole of $M$ as $g$
varies over $G$, i.e., $\tau$ is surjective: $\tau(G) = M$. Prove that if we denote the left
invariant Haar measure on $G$ by $d\mu(g)$, then $d\nu(x) = d\mu \circ \tau^{-1}(x)$ is an invariant
measure on $M$. Apply this argument to the special case of $SO(3)$ acting on
the unit sphere $S^2$ to prove that the measure induced on $S^2$ by the Haar
measure on $SO(3)$ is the area measure $\sin(\theta)\,d\theta\,d\phi$.


step 3: Construction of the irreducible representations of $SO(3)$ using spherical
harmonics. The rotation group $SO(3)$ acts on the unit sphere

$$S^2 = \{x \in \mathbb{R}^3 : |x| = 1\}$$

This action induces an action on the Hilbert space
$L^2(S^2) = \{f : S^2 \to \mathbb{C} : \int_{S^2} |f(x)|^2 dS(x) < \infty\}$ in a natural way, i.e., for $g \in SO(3)$,
$U(g) : L^2(S^2) \to L^2(S^2)$ is defined by

$$U(g)f(x) = f(g^{-1}x), \quad x \in S^2$$

Prove that $U$ is a representation of $SO(3)$, i.e.,

$$U(I_3) = I_{L^2(S^2)}, \quad U(g_1)U(g_2) = U(g_1 g_2), \quad g_1, g_2 \in SO(3)$$

To do image processing on the sphere, we must decompose the representation
$U$ into irreducibles. The first step is to note that the Lie algebra of $SO(3)$ is the
set of all real $3 \times 3$ skew symmetric matrices. This Lie algebra has a standard
basis $\{iL_1, iL_2, iL_3\}$ satisfying the commutation relations

$$[L_1, L_2] = iL_3, \quad [L_2, L_3] = iL_1, \quad [L_3, L_1] = iL_2$$

The differential $dU$ of the representation $U$ of $SO(3)$ is a representation of the
Lie algebra $so(3)$ of $SO(3)$ in $L^2(S^2)$. Its action is given by

$$dU(A)f(x) = \frac{d}{dt}f(\exp(-tA)x)\Big|_{t=0}, \quad A \in so(3)$$
Equivalently,

$$U(\exp(tA)) = \exp(t.dU(A)), \quad t \in \mathbb{R}, A \in so(3)$$

Note that

$$dU(so(3)) = \{dU(A) : A \in so(3)\}$$

is a three dimensional Lie algebra defined by the commutation relations

$$[dU(A), dU(B)] = dU([A, B]), \quad A, B \in so(3)$$

We write

$$\hat{L}_k = dU(L_k), \quad k = 1, 2, 3$$

(with $dU$ extended complex linearly, since it is $iL_k$ that belongs to $so(3)$) and hence

$$[\hat{L}_1, \hat{L}_2] = i\hat{L}_3, \quad [\hat{L}_2, \hat{L}_3] = i\hat{L}_1, \quad [\hat{L}_3, \hat{L}_1] = i\hat{L}_2$$

$\hat{L}_k, k = 1, 2, 3$ are known as the angular momentum operators in quantum mechanics.
Decomposing $U$ into irreducibles is therefore equivalent to decomposing
the Lie algebra generated by the differential operators $\hat{L}_k, k = 1, 2, 3$ acting in
$L^2(S^2)$ into irreducibles.
Remark: Note that

$$i\hat{L}_k f(x) = \frac{d}{dt}f(\exp(-itL_k)x)\Big|_{t=0}, \quad k = 1, 2, 3$$

Noting that $\exp(itL_1) \in SO(3)$ is a rotation around the $x$ axis by the angle $t$ and
likewise $\exp(itL_2), \exp(itL_3)$ are respectively rotations around the $y$ and $z$ axes
by an angle $t$, deduce that $\hat{L}_1 = -i(y\partial/\partial z - z\partial/\partial y)$, $\hat{L}_2 = -i(z\partial/\partial x - x\partial/\partial z)$,
$\hat{L}_3 = -i(x\partial/\partial y - y\partial/\partial x)$, all restricted to $L^2(S^2)$.
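The commutation relations used above can be verified on concrete matrices. The sketch below assumes the standard choice $(L_k)_{mn} = -i\epsilon_{kmn}$ (the text does not write the matrices out, so this choice is an assumption), for which $iL_k$ is a real skew symmetric matrix, and checks $[L_1, L_2] = iL_3$ and its cyclic permutations, together with the value $l(l+1) = 2$ of $L_1^2 + L_2^2 + L_3^2$ on this three dimensional ($l = 1$) representation.

```python
import numpy as np

def levi_civita():
    """Totally antisymmetric symbol eps[k, m, n] on three indices."""
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return eps

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

eps = levi_civita()
L = [-1j * eps[k] for k in range(3)]         # assumed matrices (L_k)_{mn} = -i eps_{kmn}
casimir = sum(Lk @ Lk for Lk in L)           # L_1^2 + L_2^2 + L_3^2
skew_generators = [(1j * Lk).real for Lk in L]   # i L_k, real skew symmetric
```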


Exercise (a): Calculate the explicit forms of the differential operators $\hat{L}_k, k = 1, 2, 3$
in terms of the spherical polar coordinates on the unit sphere $S^2$:
$\theta, \phi, \partial/\partial\theta, \partial/\partial\phi$.

(b) Define the second order differential operator $\hat{L}^2 = \sum_{k=1}^{3} \hat{L}_k^2$. Prove using
the commutation relations for the $\hat{L}_k$'s that $\hat{L}^2$ is a second order differential
operator that commutes with $\hat{L}_k, k = 1, 2, 3$. It is known as the Casimir operator
for the Lie algebra $dU(so(3))$.

(c) Prove that $\hat{L}^2$ is the negative of the angular part of the Laplacian in
three dimensions and is therefore a self-adjoint operator in the Hilbert space
$L^2(S^2)$.

(d) From (b) and (c) and the spectral theorem for self-adjoint operators
in a Hilbert space, deduce that the eigenspaces of $\hat{L}^2$ are left invariant under
$\hat{L}_k, k = 1, 2, 3$. In particular, show that $\hat{L}^2, \hat{L}_3$ are jointly diagonable and
show by separation of the $\theta, \phi$ variables that these joint eigenfunctions are the
spherical harmonics $Y_{lm}(x), x = (\theta, \phi) \in S^2$ satisfying the eigen-equations

$$\hat{L}^2 Y_{lm} = l(l + 1)Y_{lm}, \quad \hat{L}_3 Y_{lm} = mY_{lm}, \quad m = -l, -l + 1, \ldots, l - 1, l, \quad l = 0, 1, 2, \ldots$$

By the spectral theorem, $\{Y_{lm} : |m| \le l, l = 0, 1, \ldots\}$ form a complete orthonormal
basis for $L^2(S^2)$.


(e) Define the ladder operators

$$\hat{L}_+ = \hat{L}_1 + i\hat{L}_2, \quad \hat{L}_- = \hat{L}_1 - i\hat{L}_2$$

Prove that

$$[\hat{L}_+, \hat{L}_3] = -i\hat{L}_2 - \hat{L}_1 = -\hat{L}_+,$$

$$[\hat{L}_-, \hat{L}_3] = -i\hat{L}_2 + \hat{L}_1 = \hat{L}_-$$

or equivalently,

$$\hat{L}_+(\hat{L}_3 + 1) = \hat{L}_3\hat{L}_+, \quad \hat{L}_-(\hat{L}_3 - 1) = \hat{L}_3\hat{L}_-$$

Hence, verify that

$$(m + 1)\hat{L}_+ Y_{lm} = \hat{L}_3\hat{L}_+ Y_{lm},$$

$$(m - 1)\hat{L}_- Y_{lm} = \hat{L}_3\hat{L}_- Y_{lm}$$

Also $\hat{L}^2$ commutes with $\hat{L}_+, \hat{L}_-$ and hence

$$\hat{L}^2\hat{L}_+ Y_{lm} = l(l + 1)\hat{L}_+ Y_{lm},$$

$$\hat{L}^2\hat{L}_- Y_{lm} = l(l + 1)\hat{L}_- Y_{lm}$$

Verify that up to a proportionality constant, there is just one function, i.e., $Y_{lm}$,
which is simultaneously an eigenfunction of $\hat{L}^2$ with eigenvalue $l(l + 1)$ and an
eigenfunction of $\hat{L}_3$ with eigenvalue $m$ and hence conclude that

$$\hat{L}_+ Y_{lm} = c(l, m)Y_{l,m+1}, \quad \hat{L}_- Y_{lm} = d(l, m)Y_{l,m-1}$$

for some complex constants $c(l, m), d(l, m)$. Note that $Y_{lm}$ is to be interpreted
as zero if $m < -l$ or $m > l$. Assuming that the $Y_{lm}$'s are normalized in $L^2(S^2)$,
verify their orthogonality from basic properties of eigenfunctions of self-adjoint
operators in a Hilbert space:

$$\langle Y_{lm}, Y_{l'm'} \rangle = \int_{S^2} \bar{Y}_{lm}(\theta, \phi)Y_{l'm'}(\theta, \phi)\sin(\theta)\,d\theta\,d\phi = \delta_{ll'}\delta_{mm'}$$

Conclude that $\{Y_{lm} : |m| \le l\}$ is an orthonormal basis for a subspace $V_l$ of
$L^2(S^2)$ that is invariant under $\hat{L}_k, k = 1, 2, 3$ and hence under $U(SO(3))$.

(e) Prove that $V_l$ has no non-trivial subspaces that are invariant under
$\hat{L}_k, k = 1, 2, 3$, i.e., the restriction of $U(SO(3))$ to $V_l$ is an irreducible
representation of $SO(3)$. Denote this representation by $\pi_l$.

hint: Use the properties of $\hat{L}_3$ and $\hat{L}_+, \hat{L}_-$ acting on $Y_{lm}$.


(f) Prove that $\pi_l, l = 0, 1, 2, \ldots$ exhaust all the inequivalent irreducible
representations of $SO(3)$.
hint: For this, you must make use of the Peter-Weyl theorem which states
that if $G$ is a compact group, then $\pi_l, l = 1, 2, \ldots$ are all the inequivalent
irreducible unitary representations of $G$ iff any $f \in L^2(G)$ can be expanded as

$$f(g) = \sum_l d(l)\,Tr(\hat{f}(l)\pi_l(g)), \quad \hat{f}(l) = \int_G f(g)\pi_l(g)^*\,dg$$

(with $d(l)$ the dimension of $\pi_l$) iff

$$\delta_e(g) = \sum_l d(l)\chi_l(g), \quad \chi_l(g) = Tr(\pi_l(g))$$

iff for any class function $f(g)$ on $G$ (by a class function, we mean any function
that satisfies $f(hgh^{-1}) = f(g)\ \forall g, h \in G$), the relation

$$\int_G f(g)\chi_l(g)\,dg = 0, \quad \forall l$$

implies $f = 0$. $\chi_l$ is called the character of the representation $\pi_l$. Prove that if
$\pi_l$ is the restriction of $U(SO(3))$ to $V_l$, then

$$\chi_l(R_z(\psi)) = \sum_{m=-l}^{l} \exp(-im\psi) = \exp(il\psi)(\exp(-i(2l + 1)\psi) - 1)/(\exp(-i\psi) - 1)$$

$$= \sin((l + 1/2)\psi)/\sin(\psi/2), \quad l = 0, 1, 2, \ldots$$

using the fact that

$$\hat{L}_3 Y_{lm} = mY_{lm}$$

and hence

$$\pi_l(R_z(\psi))Y_{lm} = \exp(-i\psi\hat{L}_3)Y_{lm} = \exp(-im\psi)Y_{lm}$$

Note that $\chi_l$ is a class function since

$$Tr(\pi_l(hgh^{-1})) = Tr(\pi_l(h)\pi_l(g)\pi_l(h)^{-1}) = Tr(\pi_l(g))$$

Now let $f$ be a class function on $SO(3)$. Calculate the Haar measure on $SO(3)$
in the form

$$d\mu(g) = F(\theta, \phi, \psi)\,d\theta\,d\phi\,d\psi, \quad g = R.R_z(\psi).R^{-1}$$

where

$$R = R_x(\theta)R_z(\phi)$$

and hence evaluate for a class function $f$,

$$\int_{SO(3)} f(g)\chi_l(g)\,dg = \int_0^{2\pi} f(R_z(\psi))\chi_l(R_z(\psi))F_0(\psi)\,d\psi$$

where

$$F_0(\psi) = \int_0^{\pi}\int_0^{2\pi} F(\theta, \phi, \psi)\,d\theta\,d\phi$$

Deduce that if this vanishes for all $l = 0, 1, 2, \ldots$, then $f = 0$, proving the
completeness of the irreducible representations $\pi_l$ constructed as restrictions
of $U(SO(3))$ to $V_l = span\{Y_{lm} : |m| \le l\}$.
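The character formula and the orthogonality that underlies the completeness argument can be checked numerically. The sketch below assumes the standard normalized class-function density of the Haar measure on $SO(3)$, $(1 - \cos\psi)/\pi$ on $0 \le \psi \le \pi$; this density is quoted here as a known fact and is not derived in the text above, where the corresponding function is called $F_0$. The function names are hypothetical.

```python
import numpy as np

def character_sum(l, psi):
    """chi_l(R_z(psi)) = sum_{m=-l}^{l} exp(-i m psi), evaluated termwise."""
    m = np.arange(-l, l + 1)
    return np.exp(-1j * np.outer(psi, m)).sum(axis=1).real

def character_closed_form(l, psi):
    """sin((l + 1/2) psi) / sin(psi / 2)."""
    return np.sin((l + 0.5) * psi) / np.sin(psi / 2)

n = 2000
psi = (np.arange(n) + 0.5) * np.pi / n       # midpoint grid on (0, pi), avoids psi = 0
weight = (1 - np.cos(psi)) / np.pi           # assumed class-function Haar density

# Gram matrix of the characters chi_0, ..., chi_3 under the weighted inner product.
gram = np.array([[np.sum(character_sum(l, psi) * character_sum(k, psi) * weight) * np.pi / n
                  for k in range(4)] for l in range(4)])
```

The Gram matrix comes out as the identity, which is the orthonormality of the characters $\chi_l$; completeness is the statement that no non-zero class function is orthogonal to all of them.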


(g) Construction of the left invariant Haar measure on a locally compact Lie
group $G$. Let $\omega_1, \ldots, \omega_n$ be a basis of left invariant one forms on $G$. Then in
component form, we have

$$\omega_k(X)(g) = \omega_k^m(g)X_m(g)$$

with summation over the index $m$, where $X_m(g)$ are the components of a left
invariant vector field $X$ at $g$. We have

$$D(g) = \det((\omega_k^m(g))) = (\omega_1 \wedge \ldots \wedge \omega_n)(g)$$

Now let $L_g$ denote left translation on $G$, i.e., $L_g h = gh, g, h \in G$. Then by the
definition of left invariant vector fields and left invariant one forms, we have

$$dL_g X(e) = X(g), \quad (dL_{g^{-1}})^*\omega_k(e) = \omega_k(g)$$

Note that

$$(dL_{g^{-1}})^* = (dL_g)^{*-1}$$

Thus,

$$\omega_k(g) = \omega_k(e) \circ (dL_g)^{-1}$$

and hence

$$\omega_k(g)(X(g)) = \omega_k(X)(g) = (\omega_k(e) \circ (dL_g)^{-1})(dL_g X(e)) = \omega_k(e)(X(e))$$

$$= \omega_k(X)(e)$$


is independent of $g \in G$, i.e., it is a constant. Now, let $g = (g_1, \ldots, g_n)$ denote
coordinates for $g \in G$. Then,

$$\int f(h^{-1}g)D(g)\,dg_1 \ldots dg_n = \int f(g)D(hg)(\det dL_h)\,dg_1 \ldots dg_n$$

where $L_h$ is the left translation by $h$ represented in this coordinate system and
hence $dL_h$ becomes the $n \times n$ Jacobian matrix of $L_h$. We have on the other
hand,

$$D(hg) = (\omega_1 \wedge \ldots \wedge \omega_n)(hg)$$

$$= (dL_h)^{*-1}(\omega_1 \wedge \ldots \wedge \omega_n)(g)$$

$$= ((dL_h)^{*-1}\omega_1 \wedge \ldots \wedge (dL_h)^{*-1}\omega_n)(g)$$

$$= \omega_1(g) \circ (dL_h)^{-1} \wedge \ldots \wedge \omega_n(g) \circ (dL_h)^{-1}$$

$$= \det(dL_h)^{-1}(\omega_1 \wedge \ldots \wedge \omega_n)(g)$$

$$= \det(dL_h)^{-1}D(g)$$

and we get the required invariance result:

$$\int f(h^{-1}g)D(g)\,dg_1 \ldots dg_n = \int f(g)D(g)\,dg_1 \ldots dg_n, \quad h \in G$$


Further, we have that if $X^1, \ldots, X^n$ is any basis for the space of left invariant
vector fields on $G$ (i.e., a basis for the Lie algebra of $G$) and if $\omega_1, \ldots, \omega_n$ is its dual
basis, then $\omega_1, \ldots, \omega_n$ is a basis for the space of left invariant one forms on $G$.
Left invariance of the latter follows from the fact that

$$\omega_k(g)(X^m(g)) = \delta_k^m$$

by hypothesis and

$$X^k(g) = dL_g X^k(e)$$

so that

$$\delta_k^m = \omega_k(g)(dL_g X^m(e)) = ((dL_g)^*\omega_k(g))(X^m(e)) = \omega_k(e)(X^m(e))$$

and hence since $X^k(e), k = 1, 2, \ldots, n$ form a basis for the tangent space to $G$ at
$e$ (i.e., of the Lie algebra of $G$), it follows that

$$(dL_g)^*\omega_k(g) = \omega_k(e)$$

or equivalently,

$$\omega_k(g) = (dL_g)^{*-1}\omega_k(e)$$

proving left invariance of the one forms $\omega_k$. Now writing the equation

$$\omega_k(g)(X^m(g)) = \delta_k^m$$

in terms of components, we get

$$\omega_k^r(g)X_r^m(g) = \delta_k^m$$

where summation is over $r = 1, 2, \ldots, n$ and hence taking determinants on both
sides, we get

$$D(g).\det((X_r^m(g))) = 1$$

or equivalently,

$$D(g) = \frac{1}{\det((X_r^m(g)))}$$

In other words, the left invariant Haar density $D(g)$ for the group $G$ in any
given coordinate system for $G$ is just the reciprocal of the wedge product of a
basis of left invariant vector fields on $G$, or equivalently of the determinant of
the matrix formed by taking the components of $n$ left invariant vector fields.
This in fact gives a nice algorithm for computing the Haar measure on a matrix
Lie group.
The algorithm:
step 1: Let $G$ be a matrix Lie group of dimension $n$. Choose a basis
$\{X_1, \ldots, X_n\}$ for the Lie algebra of $G$. Represent any $g \in G$ as

$$g = \exp(t_1 X_1) \ldots \exp(t_n X_n), \quad t_1, \ldots, t_n \in \mathbb{R}$$

For each $k = 1, 2, \ldots, n$ and $t \in \mathbb{R}$ compute

$$g.\exp(tX_k) = \exp(t_1 X_1) \ldots \exp(t_n X_n).\exp(tX_k)$$

and express it as

$$g.\exp(tX_k) = \exp(a_{k1}(t, t_1, \ldots, t_n)X_1) \ldots \exp(a_{kn}(t, t_1, \ldots, t_n)X_n), \quad k = 1, 2, \ldots, n$$

For a function $f$ on $G$, we write

$$\tilde{f}(t_1, \ldots, t_n) = f(g) = f(\exp(t_1 X_1) \ldots \exp(t_n X_n))$$

and then observe that

$$f(g.\exp(tX_k)) = \tilde{f}(a_{k1}(t, t_1, \ldots, t_n), a_{k2}(t, t_1, \ldots, t_n), \ldots, a_{kn}(t, t_1, \ldots, t_n))$$

and hence if we use the notation $\hat{X}_k$ for the differential operator associated with
the vector field $X_k$, we get

$$(\hat{X}_k f)(g) = \frac{d}{dt}f(g.\exp(tX_k))\Big|_{t=0} = \sum_{m=1}^{n} \frac{\partial a_{km}(0, t_1, \ldots, t_n)}{\partial t}\,\frac{\partial \tilde{f}(t_1, \ldots, t_n)}{\partial t_m}$$

This means that in the above set of coordinates $(t_1, \ldots, t_n)$ on $G$, the left invariant
vector field $X_k$ is represented by the first order differential operator

$$\hat{X}_k = \sum_{m=1}^{n} \frac{\partial a_{km}(0, t_1, \ldots, t_n)}{\partial t}\,\frac{\partial}{\partial t_m}$$

and therefore the left invariant Haar density $D(t_1, \ldots, t_n)$ on $G$ in the coordinate
system $(t_1, \ldots, t_n)$ is given by

$$D(t_1, \ldots, t_n) = \Big[\det\Big(\Big(\frac{\partial a_{km}(0, t_1, \ldots, t_n)}{\partial t}\Big)\Big)\Big]^{-1}$$

Note that the corresponding left invariant Haar integral $\int_G f(g)\,dg$ is expressed
in the above coordinate system as

$$\int f(\exp(t_1 X_1) \ldots \exp(t_n X_n))D(t_1, \ldots, t_n)\,dt_1 \ldots dt_n$$

$$= \int \tilde{f}(t_1, \ldots, t_n)D(t_1, \ldots, t_n)\,dt_1 \ldots dt_n$$
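Example: Here we compute the Haar measure on $SO(3)$ in terms of the
Euler angles. Any rotation $R$ (i.e., $R \in SO(3)$) can be represented as

$$R = R_z(\phi)R_x(\theta)R_z(\psi)$$

$\phi, \theta, \psi$ are known as the Euler angles. Note that

$$R_z(\phi) = \exp(-iL_3\phi), \quad R_x(\theta) = \exp(-iL_1\theta)$$

We write $X_k = -iL_k, k = 1, 2, 3$. Then,

$$R_z(\phi) = \exp(\phi X_3), \quad R_x(\theta) = \exp(\theta X_1)$$

and

$$[X_1, X_2] = -[L_1, L_2] = -iL_3 = X_3, \quad [X_2, X_3] = X_1, \quad [X_3, X_1] = X_2$$

Note that $X_1, X_2, X_3$ are real skew-symmetric matrices. Then, write

$$\tilde{f}(\phi, \theta, \psi) = f(R) = f(\exp(\phi X_3).\exp(\theta X_1).\exp(\psi X_3))$$

We find that the left invariant vector fields $\hat{X}_k, k = 1, 2, 3$ associated with the
$so(3)$ Lie algebra basis elements $X_k, k = 1, 2, 3$ respectively are given by the
following computations, in which a Lie algebra element $X$ appearing inside the
argument of $f$ is shorthand for $\frac{d}{dt}|_{t=0}$ of $f$ with $\exp(tX)$ in its place, so that
$f(R.X) = (\hat{X}f)(R)$:

$$\hat{X}_3\tilde{f}(\phi, \theta, \psi) = \frac{d}{dt}f(\exp(\phi X_3).\exp(\theta X_1).\exp(\psi X_3).\exp(tX_3))\Big|_{t=0}$$

$$= f(\exp(\phi X_3).\exp(\theta X_1).\exp(\psi X_3).X_3) = \frac{\partial}{\partial\psi}\tilde{f}(\phi, \theta, \psi)$$

$$\frac{\partial}{\partial\theta}\tilde{f}(\phi, \theta, \psi) = f(R_z(\phi)R_x(\theta)X_1 R_z(\psi)) = f(R_z(\phi)R_x(\theta)R_z(\psi)R_z(-\psi)X_1 R_z(\psi))$$

$$= f(R.\exp(-\psi.ad(X_3))(X_1))$$

Now

$$\exp(-\psi.ad(X_3))(X_1) = X_1 - \psi[X_3, X_1] + (\psi^2/2)[X_3, [X_3, X_1]] + \ldots$$

$$= X_1 - \psi X_2 - (\psi^2/2)X_1 + \ldots = X_1\cos(\psi) - X_2\sin(\psi)$$

and hence,

$$\frac{\partial\tilde{f}(\phi, \theta, \psi)}{\partial\theta} = f(R.(\cos(\psi)X_1 - \sin(\psi)X_2)) = ((\cos(\psi)\hat{X}_1 - \sin(\psi)\hat{X}_2)f)(R)$$

and finally,

$$\frac{\partial}{\partial\phi}\tilde{f}(\phi, \theta, \psi) = f(R_z(\phi)X_3 R_x(\theta)R_z(\psi)) = f(R.\exp(-\psi.ad(X_3)).\exp(-\theta.ad(X_1))(X_3))$$

Now

$$\exp(-\theta.ad(X_1))(X_3) = X_3 - \theta[X_1, X_3] + (\theta^2/2)[X_1, [X_1, X_3]] + \ldots = X_2\sin(\theta) + X_3\cos(\theta)$$

since $[X_1, X_3] = -X_2$ and $[X_1, [X_1, X_3]] = -X_3$, and

$$\exp(-\psi.ad(X_3)).\exp(-\theta.ad(X_1))(X_3) = \exp(-\psi.ad(X_3))(X_2\sin(\theta) + X_3\cos(\theta))$$

$$= \sin(\theta)(X_2\cos(\psi) + X_1\sin(\psi)) + \cos(\theta)X_3$$

Thus,

$$\frac{\partial}{\partial\phi}\tilde{f}(\phi, \theta, \psi) = f(R.(\sin(\theta)\cos(\psi)X_2 + \sin(\theta)\sin(\psi)X_1 + \cos(\theta)X_3))$$

$$= ((\sin(\theta)\cos(\psi)\hat{X}_2 + \sin(\theta)\sin(\psi)\hat{X}_1 + \cos(\theta)\hat{X}_3)f)(R)$$

Thus we obtain the following correspondences:

$$\hat{X}_3 \leftrightarrow \partial/\partial\psi,$$

$$\cos(\psi)\hat{X}_1 - \sin(\psi)\hat{X}_2 \leftrightarrow \partial/\partial\theta,$$

$$\sin(\theta)\cos(\psi)\hat{X}_2 + \sin(\theta)\sin(\psi)\hat{X}_1 + \cos(\theta)\hat{X}_3 \leftrightarrow \partial/\partial\phi$$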


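Thus, we have an expression of the form

$$\begin{pmatrix} \partial/\partial\phi \\ \partial/\partial\theta \\ \partial/\partial\psi \end{pmatrix} = A(\phi, \theta, \psi)\begin{pmatrix} \hat{X}_1 \\ \hat{X}_2 \\ \hat{X}_3 \end{pmatrix}$$

where $A(\phi, \theta, \psi)$ is a $3 \times 3$ matrix whose elements are functions of $(\phi, \theta, \psi)$
and by the above discussion, it follows that the Haar density on $SO(3)$ is given
in terms of Euler angles by

$$D(\phi, \theta, \psi)\,d\phi\,d\theta\,d\psi$$

where

$$D(\phi, \theta, \psi) = \det(A(\phi, \theta, \psi))$$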


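Remark: $\hat{X}_k, k = 1, 2, 3$ are left invariant vector fields which are expressed
in terms of their components w.r.t. the Euler angle coordinate system as

$$\hat{X}_k = X_{k1}(\phi, \theta, \psi)\partial/\partial\phi + X_{k2}(\phi, \theta, \psi)\partial/\partial\theta + X_{k3}(\phi, \theta, \psi)\partial/\partial\psi, \quad k = 1, 2, 3$$

Thus,

$$\hat{X}_1 \wedge \hat{X}_2 \wedge \hat{X}_3 = \det((X_{km}))\,\partial/\partial\phi \wedge \partial/\partial\theta \wedge \partial/\partial\psi$$

Note that

$$\det((X_{km})) = 1/\det(A)$$

If $\omega_k, k = 1, 2, 3$ is the basis of one forms on $SO(3)$ that is dual to the basis
$\hat{X}_k, k = 1, 2, 3$, then writing

$$\omega_k = \omega_{k1}d\phi + \omega_{k2}d\theta + \omega_{k3}d\psi, \quad k = 1, 2, 3$$

we get that

$$\omega_k(\hat{X}_m) = \delta_{km}$$

and hence since $(d\phi, d\theta, d\psi)$ is the dual basis of $(\partial/\partial\phi, \partial/\partial\theta, \partial/\partial\psi)$, it follows
that

$$\omega_{kr}X_{mr} = \delta_{km}$$

where summation on the left over the repeated index $r$ is understood. Thus,

$$\omega_1 \wedge \omega_2 \wedge \omega_3 = \det((\omega_{kr}))\,d\phi \wedge d\theta \wedge d\psi$$

with

$$\det((\omega_{kr})) = 1/\det((X_{kr})) = \det A = D$$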
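The Euler angle computation can be cross-checked numerically. The sketch below is one possible check, not the text's own procedure: it differentiates $g(\phi, \theta, \psi) = R_z(\phi)R_x(\theta)R_z(\psi)$ by central differences and expands each skew symmetric matrix $g^{-1}\partial g/\partial t_m$ in the basis $X_1, X_2, X_3$, which gives the rows of the matrix $A$ relating the coordinate derivatives to the left invariant fields. The explicit rotation matrices assume the convention $X_k = -iL_k$ with $(L_k)_{mn} = -i\epsilon_{kmn}$; the determinant comes out as $-\sin\theta$, so the density is taken as its absolute value $\sin\theta$.

```python
import numpy as np

def Rz(a):
    """exp(a X_3): rotation about the z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def Rx(a):
    """exp(a X_1): rotation about the x axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def euler(phi, theta, psi):
    return Rz(phi) @ Rx(theta) @ Rz(psi)

def haar_density(phi, theta, psi, h=1e-6):
    """Return (|det A|, A), where d/dt_m = sum_k A[m, k] X_k for t = (phi, theta, psi)."""
    g = euler(phi, theta, psi)
    coords = np.array([phi, theta, psi])
    A = np.zeros((3, 3))
    for m in range(3):
        d = np.zeros(3)
        d[m] = h
        dg = (euler(*(coords + d)) - euler(*(coords - d))) / (2 * h)
        Om = g.T @ dg                          # g^{-1} dg/dt_m, skew symmetric
        A[m] = [Om[2, 1], Om[0, 2], Om[1, 0]]  # coefficients along X_1, X_2, X_3
    return abs(np.linalg.det(A)), A

phi, theta, psi = 0.3, 1.1, -0.7
D, A = haar_density(phi, theta, psi)
A_expected = np.array([
    [np.sin(theta) * np.sin(psi), np.sin(theta) * np.cos(psi), np.cos(theta)],
    [np.cos(psi), -np.sin(psi), 0.0],
    [0.0, 0.0, 1.0],
])
```

The result $D = \sin\theta$ is the familiar Haar density $\sin\theta\,d\phi\,d\theta\,d\psi$ on $SO(3)$ in Euler angles.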


[h] Induced representations for semidirect products.

[i] Semidirect products. The prototype example here is the 3-D Euclidean
motion group of rotations and translations. This group is represented as

$$G = \mathbb{R}^3 \otimes_s SO(3)$$

where $\otimes_s$ denotes semidirect product and its form is derived by acting two
elements $g_1, g_2$ of $G$ successively on a point $x \in \mathbb{R}^3$. Let $g_1 = (a_1, R_1), g_2 = (a_2, R_2) \in G$,
i.e., $a_1, a_2 \in \mathbb{R}^3, R_1, R_2 \in SO(3)$. Then $g_1$ acts on a point $x \in \mathbb{R}^3$
by rotating it by $R_1$ followed by translating it by $a_1$:

$$g_1.x = R_1 x + a_1,$$

Further $g_2$ acts on $g_1.x$ in the same way taking it to

$$g_2.(g_1.x) = R_2(R_1 x + a_1) + a_2 = R_2 R_1 x + R_2 a_1 + a_2$$

$$= (R_2 a_1 + a_2, R_2 R_1).x$$

This formula, by which two successive Euclidean motion group
elements act on a point in 3-D space, is used to define the composition law in
$G$:

$$(g_2.g_1).x = g_2.(g_1.x)$$

giving the composition law in $G$ as

$$g_2.g_1 = (a_2, R_2).(a_1, R_1) = (R_2 a_1 + a_2, R_2 R_1)$$

This composition law is easily verified to be associative. In fact, we have

$$(g_3.(g_2.g_1)).x = g_3.((g_2.g_1).x) = g_3.(g_2.(g_1.x)) = (g_3.g_2).(g_1.x)$$

$$= ((g_3.g_2).g_1).x$$

for all $x$ by definition, from which we deduce the associative property:

$$g_3.(g_2.g_1) = (g_3.g_2).g_1$$
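The composition law of the Euclidean motion group is simple to exercise in code. The following is a minimal sketch with hypothetical helper names (`compose`, `act`, `random_rotation`); group elements are stored as pairs `(a, R)`, and random rotations are generated from a QR factorization purely for testing.

```python
import numpy as np

def compose(g2, g1):
    """(a2, R2).(a1, R1) = (R2 a1 + a2, R2 R1)."""
    a2, R2 = g2
    a1, R1 = g1
    return (R2 @ a1 + a2, R2 @ R1)

def act(g, x):
    """g.x = R x + a: rotate by R, then translate by a."""
    a, R = g
    return R @ x + a

def random_rotation(rng):
    """A random element of SO(3) from the QR factorization of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]           # flip one column to get determinant +1
    return Q

rng = np.random.default_rng(3)
g1, g2, g3 = [(rng.standard_normal(3), random_rotation(rng)) for _ in range(3)]
x = rng.standard_normal(3)

left = compose(g3, compose(g2, g1))      # g3.(g2.g1)
right = compose(compose(g3, g2), g1)     # (g3.g2).g1
```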


More generally, let $N$ be an Abelian subgroup of a group $G$ and $H$ another
subgroup of $G$ such that

$$G = N \times_s H$$

which means that (a) every $g \in G$ is uniquely expressible as $g = n.h$ with
$n \in N, h \in H$ and (b) $N$ is a normal subgroup of $G$, which in view of the
Abelian property of $N$ and property (a) means that $hNh^{-1} = N\ \forall h \in H$. Then,
we have for $g_1 = n_1 h_1, g_2 = n_2 h_2$ that

$$g_2 g_1 = n_2 h_2.n_1 h_1 = (n_2 h_2 n_1 h_2^{-1}).(h_2 h_1)$$

We can represent $g = nh \in G$ uniquely as $g = (n, h) \in N \times H$ and hence, the
above composition law can also be expressed as

$$(n_2, h_2).(n_1, h_1) = (n_2 h_2 n_1 h_2^{-1}, h_2 h_1) = (n_2\tau_{h_2}(n_1), h_2 h_1)$$

where $\tau_h(n) = hnh^{-1}$, just as in the case of the Euclidean motion group with
$N = \mathbb{R}^3$ and $H = SO(3)$. More generally, let $N$ be any Abelian group and $H$
any other group such that there is a homomorphism $\tau : H \to aut(N)$. This
means that for any $h \in H$, $\tau_h \in aut(N)$, i.e., $\tau_h(n) \in N\ \forall n \in N$,
$\tau_h(n_1 n_2) = \tau_h(n_1).\tau_h(n_2)\ \forall n_1, n_2 \in N$ and $\tau_{h_2} \circ \tau_{h_1} = \tau_{h_2 h_1}$ for all $h_1, h_2 \in H$.
Remark: In our Euclidean motion group case, we have $\tau_R(a) = Ra, R \in SO(3), a \in \mathbb{R}^3$.
All the properties required of $\tau$ are satisfied here: $\tau$ is a homomorphism
from $SO(3)$ into $aut(\mathbb{R}^3)$, $aut(\mathbb{R}^3)$ being the multiplicative group of
all non-singular linear transformations acting on $\mathbb{R}^3$, or equivalently, the group
of all non-singular $3 \times 3$ matrices. In fact, $\tau_R = R$ here. Then,
$\tau_R(a_1 + a_2) = R(a_1 + a_2) = Ra_1 + Ra_2 = \tau_R(a_1) + \tau_R(a_2)$ which proves that $\tau_R$ is an
automorphism of $\mathbb{R}^3$ and secondly, $\tau_{R_2 R_1}(a) = R_2 R_1 a = R_2.(R_1 a) = \tau_{R_2} \circ \tau_{R_1}(a)$,
proving that $\tau$ is a homomorphism of $SO(3)$ into the group $aut(\mathbb{R}^3)$. Note that
the composition operation $(n_1, n_2) \to n_1 n_2$ in the Abelian group $N = \mathbb{R}^3$ is here
given by addition of 3-D vectors.

Coming back to the general case, we define the group $G = N \otimes_s H$ by $N \times H$
with the composition operation

$$(n_2, h_2).(n_1, h_1) = (n_2\tau_{h_2}(n_1), h_2 h_1) \quad (a)$$

which in the Euclidean motion group case specializes to

$$(a_2, R_2).(a_1, R_1) = (a_2 + R_2 a_1, R_2 R_1)$$

as obtained earlier.


Exercise: Verify that (a) defines a valid associative product on G and makes 
G into a group. 


We can show that the particular homomorphism $\tau$ used for defining the semidirect
product is not important, by proving that any such semidirect product can be reduced,
via a group isomorphism, to the standard one: $\tau_h(n) = hnh^{-1}$. Indeed, define
the group $\tilde{N} = N \times \{e_H\} = \{(n, e_H) : n \in N\}$ and the group
$\tilde{H} = \{e_N\} \times H = \{(e_N, h) : h \in H\}$. These are two subgroups of $G = N \times H$ and, let

$$(e_N, h).(n, e_H).(e_N, h)^{-1} = (n', h')$$

Then,

$$(e_N, h).(n, e_H) = (n', h').(e_N, h) = (n'\tau_{h'}(e_N), h'h) = (n', h'h)$$

while the left side equals $(\tau_h(n), h)$, so that

$$(\tau_h(n), h) = (n', h'h)$$

from which we get

$$h' = e_H, \quad n' = \tau_h(n)$$

proving that

$$(e_N, h).(n, e_H).(e_N, h)^{-1} = (\tau_h(n), e_H) \in \tilde{N}$$

Thus, $N$ is isomorphic to $\tilde{N}$, $H$ is isomorphic to $\tilde{H}$ and the composition in $G$ is
given by

$$(n_2, h_2).(n_1, h_1) = (n_2\tau_{h_2}(n_1), h_2 h_1) = \tilde{n}_2(\tilde{h}_2\tilde{n}_1\tilde{h}_2^{-1})\tilde{h}_2\tilde{h}_1$$

where

$$\tilde{n}_2 = (n_2, e_H), \quad \tilde{h}_2 = (e_N, h_2),$$

and likewise for $\tilde{n}_1, \tilde{h}_1$. Thus in any semidirect product, we can always assume
that the homomorphism $\tau$ from $H$ into $aut(N)$ is of the form $\tau_h(n) = hnh^{-1}$.
For the Euclidean motion group, this fact reads as follows:

$$(0, R).(a, I).(0, R^{-1}).x = (0, R).(R^{-1}x + a) = x + Ra = (Ra, I).x$$

so that

$$(0, R).(a, I).(0, R^{-1}) = (Ra, I)$$

which is the same as saying that

$$\tau_R(a) = Ra$$

Here $(0, R)$ is rotation by $R$ and $(a, I)$ is translation by $a$.


(ii) Irreducible representations of a semidirect product. 

Now we address the problem of determining all the inequivalent irreducible
representations of a semidirect product. This will enable us to do image
processing for problems involving, for example, estimating both the translation and
rotation or, more generally, in the case of the Galilean group of motions
for images in motion, estimating the translation vector, the velocity vector, the
time delay and the rotation applied to an object field defined on $\mathbb{R}^3$.

Consider first the case of a finite group $G$ that is expressible as a semidirect
product. Let $N$ be an Abelian subgroup and $H$ another subgroup that
normalises $N$. Suppose we can write

$$g = nh, \quad n \in N, h \in H$$

uniquely for each $g \in G$, i.e.,

$$G = N \times_s H$$

Let $U$ be an irreducible unitary representation of $G$ in a finite dimensional
Hilbert space $\mathcal{H}$. Then $U(n), n \in N$ is a commuting family of unitary operators
in $\mathcal{H}$ and hence can be jointly diagonalized. This means that we can find
characters $\chi_k, k = 1, 2, \ldots, M$ of $N$ such that

$$U(n) = \sum_{k=1}^{M} \chi_k(n)P_k, \quad n \in N$$

where $\{P_k : 1 \le k \le M\}$ is a complete spectral family in $\mathcal{H}$, i.e.,
$P_k^* = P_k$, $P_k P_m = 0, k \ne m$, $\sum_{k=1}^{M} P_k = I$. Note that by unitarity of the $U(n)$'s
and the representation property of $U$, the $\chi_k$ satisfy

$$|\chi_k(n)| = 1, \quad \chi_k(n_1 n_2) = \chi_k(n_1)\chi_k(n_2), \quad n_1, n_2 \in N$$
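Choose a $\chi_0 \in \{\chi_1, \ldots, \chi_M\}$ and for any character $\chi$ of $N$, let $V_\chi$ denote the
eigensubspace of $U|_N$ corresponding to the "eigenvalue" $\chi$. In other words,

$$V_\chi = \{v \in \mathcal{H} : U(n)v = \chi(n)v\ \forall n \in N\}$$

For example,

$$V_{\chi_k} = P_k(\mathcal{H}) = R(P_k)$$

Then, from the relation

$$U(n)U(h)v = U(h)U(h^{-1}nh)v$$

and the fact that $H$ normalises $N$, it follows that

$$U(n)U(h)v = \chi_0(h^{-1}nh)U(h)v, \quad \forall v \in V_{\chi_0}, h \in H, n \in N$$

Writing

$$\beta_h(n) = hnh^{-1},$$

and

$$\beta_h\chi(n) = \chi(\beta_h^{-1}(n)) = \chi(h^{-1}nh)$$

for any character $\chi$ of $N$, we get that

$$U(n)U(h)v = \beta_h\chi(n)U(h)v, \quad \forall v \in V_\chi, h \in H$$

In other words,

$$U(h)V_\chi = V_{\beta_h\chi}, \quad h \in H, \chi \in \hat{N}$$

Here, $\hat{N}$ denotes the character group of $N$. In particular, we have

$$U(h)V_{\chi_0} = V_{\beta_h\chi_0}, \quad h \in H$$

Note that

$$\beta_h\chi \in \hat{N}\ \forall \chi \in \hat{N}, h \in H$$

Let

$$O(\chi_0) = \{\beta_h\chi_0 : h \in H\}$$

$O(\chi_0)$ is the orbit of $\chi_0$ in $\hat{N}$ under the action of $H$ defined via the group action
$\beta$. Note that

$$\beta_{h_2} \circ \beta_{h_1}\chi = \beta_{h_2 h_1}\chi, \quad h_1, h_2 \in H, \chi \in \hat{N}$$

(Prove this)
We write

$$W = \bigoplus_{\chi \in O(\chi_0)} V_\chi$$

and claim that the irreducibility of $U$ implies that

$$W = \mathcal{H}$$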
In fact, this directly follows from the $U(G)$-invariance of $W$. To prove the $U(G)$
invariance of $W$, we first observe that

$$U(h)V_\chi = V_{\beta_h\chi}, \quad h \in H, \chi \in \hat{N}$$

and $\beta_h\chi$ is in the orbit of $\chi$, which is also the orbit of $\chi_0$ whenever $\chi$ belongs
to the orbit of $\chi_0$. Thus $U(h)$ leaves $W$ invariant for each $h \in H$. Secondly, if
$n \in N$, then

$$U(n)V_\chi = V_\chi, \quad \chi \in \hat{N}$$

since

$$U(n)u = \chi(n)u, \quad \forall u \in V_\chi$$

This proves that each $V_\chi$ is $U(n)$ invariant for any $\chi \in \hat{N}$ and in particular, this
is true if $\chi \in O(\chi_0)$. The proof that $W$ is $U(G)$-invariant follows from this.
The next point to note is that if we define

$$H_0 = \{h \in H : \beta_h\chi_0 = \chi_0\}$$

then $H_0$ is a subgroup of $H$, called the isotropy/little group of $\chi_0$. We now
claim that by defining

$$\sigma(h) = U(h)|_{V_{\chi_0}}, \quad h \in H_0$$

we get a representation $\sigma$ of $H_0$ in $V_{\chi_0}$ and further, the irreducibility of $U$ implies
the irreducibility of $\sigma$. First, note that the representation $\sigma$ of $H_0$ is well defined
since

$$U(h)V_{\chi_0} = V_{\chi_0}, \quad h \in H_0$$

Now, suppose $W_0$ is a $\sigma(H_0)$-invariant subspace of $V_{\chi_0}$. Then, for any $h \in H$,
$U(h)W_0$ equals $W_0$ if $h \in H_0$ and otherwise, it is a subspace of $V_{\beta_h\chi_0}$. Further,
$U(n)W_0 = W_0$ for each $n \in N$ since $U(n)v = \chi_0(n)v$ for each $v \in W_0$ because
$W_0 \subset V_{\chi_0}$. Thus, we have proved that

$$W = \bigoplus_{k=1}^{r} U(h_k)W_0$$

is $U$ invariant, where $\{h_1, \ldots, h_r\} \subset H$ is any complete set of representatives of
$H/H_0$ (i.e., $H$ is the disjoint union of $h_k H_0, k = 1, 2, \ldots, r$) and hence,
$\{\beta_{h_k}\chi_0 : k = 1, 2, \ldots, r\} = O(\chi_0)$. But then,

$$W = \mathcal{H}$$

since $U$ is irreducible. Hence it must be true that $W_0 = V_{\chi_0}$. (Note that the
subspaces $U(h_k)W_0, k = 1, 2, \ldots, r$ are mutually orthogonal subspaces of $\mathcal{H}$, each
having the same dimension, $\dim W_0$. They are orthogonal because $U(h_k)W_0 \subset V_{\beta_{h_k}\chi_0}$
and the latter are all orthogonal because they are the eigensubspaces
of $U|_N$ with different eigenvalues and $U|_N$ is a unitary representation.) This
completes the proof of the irreducibility of $\sigma$ as a unitary representation of the
little group.
Conversely, suppose $\sigma$ is any irreducible representation of the little group $H_0$ of a character $\chi_0 \in \hat{N}$ (appearing in the semidirect product $G = N \rtimes H$) in the Hilbert space $V_0$. Then we can reverse the entire argument above to arrive at an irreducible representation $U$ of $G$ in a Hilbert space $\mathcal{H}$. Formally, to see how this construction is carried out, we first construct the $H$-orbit of $\chi_0$:

$$O(\chi_0) = \{\beta_h.\chi_0 : h \in H\} = \{\beta_{h_k}.\chi_0, k = 0,1,2,\ldots,r-1\}$$
where $h_kH_0, k = 0,1,2,\ldots,r-1$ with $h_0 = e$ are all the distinct and hence disjoint cosets of $H_0$ in $H$:

$$H = \bigcup_{k=0}^{r-1} h_kH_0$$

We then formally attach a vector space $V_k$ to the character $\beta_{h_k}.\chi_0$ for each $k = 0,1,\ldots,r-1$ so that

$$U(h_k)V_0 = V_k, \quad k = 0,1,\ldots,r-1$$
Then define

$$\mathcal{H} = \bigoplus_{k=0}^{r-1} V_k$$

as an orthogonal direct sum. It remains to define the action of $U(G)$ on $\mathcal{H}$ compatible with the above definitions. This is done as follows. Let $g = nh \in G, n \in N, h \in H$. Then let $v \in V_k$ for some $k = 0,1,2,\ldots,r-1$. Let $s \in \{0,1,\ldots,r-1\}$ be the unique element such that

$$hh_k \in h_sH_0$$
Then,
$$U(h)v \in V_s$$
will be defined as follows. Choose an onb $\{\phi_{0,1},\ldots,\phi_{0,m}\}$ for $V_0$ and then for each $k = 1,2,\ldots,r-1$, define an onb $\{\phi_{k,1},\ldots,\phi_{k,m}\}$ for $V_k$ so that
$$U(h_k)\phi_{0,l} = \phi_{k,l}, \quad k = 0,1,\ldots,r-1, \ l = 1,2,\ldots,m$$

Then, for $h \in H$, we can write $hh_k \in h_sH_0$ as above. Thus, $hh_k = h_sh_0'$ for some $h_0' \in H_0$ (the prime distinguishes this element of the little group from the coset representative $h_0 = e$). Then

$$U(h)\phi_{k,l} = U(h)U(h_k)\phi_{0,l} = U(hh_k)\phi_{0,l} = U(h_sh_0')\phi_{0,l} = U(h_s)U(h_0')\phi_{0,l}$$

$$= U(h_s)\sigma(h_0')\phi_{0,l} = U(h_s)\sum_{l'=1}^{m}[\sigma(h_0')]_{l'l}\phi_{0,l'} = \sum_{l'=1}^{m}[\sigma(h_0')]_{l'l}U(h_s)\phi_{0,l'}$$

$$= \sum_{l'=1}^{m}[\sigma(h_0')]_{l'l}\phi_{s,l'}$$


This formula defines the representation $U$ of $G$ on the space $\mathcal{H}$ spanned by the onb $\{\phi_{k,l} : 0 \le k \le r-1, 1 \le l \le m\}$. Note how the action of $U(n), n \in N$ is defined:

$$U(n)\phi_{k,l} = \beta_{h_k}.\chi_0(n)\phi_{k,l} = \chi_0(h_k^{-1}nh_k)\phi_{k,l}$$

Note that this implies
$$U(n)V_k = V_k, \quad k = 0,1,\ldots,r-1$$


By reversing the argument above, it is easily proved that $U$ is an irreducible unitary representation of $G$ in $\mathcal{H}$. In fact, we first observe that since $\{\phi_{k,l} : 1 \le l \le m\}$ is an onb for $V_k$ by definition, and $U(h_k)\phi_{0,l} = \phi_{k,l}$, it follows immediately that

$$U(h_k)V_0 = V_k, \quad k = 0,1,\ldots,r-1$$

Now suppose that $W$ is a $U(G)$-invariant subspace of $\mathcal{H}$. Then consider the subspace $W_0 = W \cap V_0$ of $V_0$. We claim that $W_0$ is a $\sigma(H_0)$-invariant subspace of $V_0$. This immediately follows from the definitions. Hence by the irreducibility of $\sigma(H_0)$, it follows that $W_0 = V_0$ and therefore that $V_0 \subset W$ and since

$$U(h_k)V_0 = V_k, \quad k = 0,1,\ldots,r-1$$

and
$$U(h_k)W = W, \quad k = 0,1,\ldots,r-1$$

by the $U(G)$-invariance of $W$, it follows that

$$V_k = U(h_k)V_0 \subset U(h_k)W = W, \quad k = 0,1,\ldots,r-1$$

Therefore
$$\mathcal{H} = \bigoplus_{k=0}^{r-1} V_k \subset W$$
ie,
$$W = \mathcal{H}$$

proving the irreducibility of $U(G)$. This completes the construction. Note that the dimension of the irreducible representation $U$ of $G$ is related to the dimension of the irreducible representation $\sigma$ of the little group $H_0$ by

$$\dim U = rm = o(H/H_0).\dim(\sigma) = o(O(\chi_0)).\dim(\sigma)$$
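The construction above can be checked numerically on a small case. The following is a minimal sketch, assuming the dihedral group $D_n = \mathbb{Z}_n \rtimes \mathbb{Z}_2$ with $N = \mathbb{Z}_n$, $H = \{e, r\}$ acting by $\beta_r(a) = -a$, and $\chi_0(a) = \exp(2\pi jka/n)$; for $2k \not\equiv 0 \pmod n$ the little group is trivial, so $\sigma$ is trivial, $m = 1$ and $r = 2$. The function names `dihedral_irrep`, `mul` and `matmul` are illustrative choices, not taken from the text.

```python
import cmath

def dihedral_irrep(n, k):
    """Matrices U(a, s) of the representation built from chi_0(a) = exp(2*pi*i*k*a/n)."""
    def chi0(a):
        return cmath.exp(2j * cmath.pi * k * a / n)

    def U(a, s):
        if s == 0:
            # U(n) phi_k = chi_0(h_k^{-1} n h_k) phi_k: diagonal on V_0 + V_1.
            return [[chi0(a), 0], [0, chi0(-a)]]
        # U(a) U(r): U(r) swaps phi_0 and phi_1 (sigma is trivial here).
        return [[0, chi0(a)], [chi0(-a), 0]]
    return U

def mul(g1, g2, n):
    """Product in Z_n x| Z_2: (a1, s1)(a2, s2) = (a1 + (-1)^s1 a2, s1 + s2)."""
    (a1, s1), (a2, s2) = g1, g2
    return ((a1 + (-1) ** s1 * a2) % n, s1 ^ s2)

def matmul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

n, k = 5, 1
U = dihedral_irrep(n, k)
G = [(a, s) for a in range(n) for s in (0, 1)]

# Largest entrywise violation of U(g1) U(g2) = U(g1 g2).
hom_err = max(
    abs(matmul(U(*g1), U(*g2))[i][j] - U(*mul(g1, g2, n))[i][j])
    for g1 in G for g2 in G for i in range(2) for j in range(2)
)
# Irreducibility test: (1/|G|) sum_g |Tr U(g)|^2 equals 1 iff U is irreducible.
char_norm = sum(abs(U(*g)[0][0] + U(*g)[1][1]) ** 2 for g in G) / len(G)
print(hom_err, char_norm)
```

Here $\dim U = 2 = o(O(\chi_0)).\dim(\sigma)$, in agreement with the dimension formula.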


Let $G$ be a compact group acting transitively on a manifold $M$. The prototype example of this is $SO(3)$ acting on $S^2$. Take an image field $f_1 : M \to \mathbb{C}$. The image field $f_2$ on $M$ after transforming it by a $g \in G$ and adding noise to it is given by the statistical model

$$f_2(x) = f_1(g^{-1}x) + w(x), \quad x \in M$$


(i) The irreducible representations of $S_m$, the permutation/symmetric group.

Preliminaries:

[1] The group algebra of a finite group. Let $G$ be a finite group. Its group algebra consists of all formal linear combinations

$$f = \sum_{g \in G} f(g)g$$
where $f : G \to \mathbb{C}$ is arbitrary. We denote this set by $\mathcal{A}(G)$. If $f_1, f_2 \in \mathcal{A}(G)$, their product is defined by

$$f_1f_2 = \sum_{g,h \in G} f_1(g)f_2(h)gh = \sum_{g \in G}(f_1 * f_2)(g)g$$
where
$$(f_1 * f_2)(g) = \sum_{h \in G} f_1(gh^{-1})f_2(h) = \sum_{h \in G} f_1(h)f_2(h^{-1}g)$$

is called the convolution of the functions $f_1$ and $f_2$ on $G$. Addition and scalar multiplication in $\mathcal{A}(G)$ are defined in the usual way:

$$cf_1 + f_2 = \sum_{g \in G}(cf_1(g) + f_2(g))g, \quad c \in \mathbb{C}$$

With these operations, $\mathcal{A}(G)$ becomes an algebra and is called the group algebra of $G$. Equivalently, $\mathcal{A}(G)$ can be viewed as the set of all complex valued functions on $G$ with multiplication defined by the convolution operation as above and addition and scalar multiplication defined in the usual way, ie, pointwise on $G$.
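The equivalence of the two descriptions of the product can be illustrated on $S_3$. The sketch below is only an illustration under stated assumptions: permutations are stored as tuples, the product $gh$ is taken to act as $(gh)(i) = g(h(i))$, and the coefficient functions `f1`, `f2` are arbitrary made-up integers.

```python
from itertools import permutations

def compose(g, h):
    """Product gh acting as (gh)(i) = g(h(i)); permutations are tuples."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def convolve(f1, f2, G):
    """(f1 * f2)(g) = sum_h f1(g h^{-1}) f2(h)."""
    return {g: sum(f1[compose(g, inverse(h))] * f2[h] for h in G) for g in G}

def algebra_product(f1, f2, G):
    """Coefficients of (sum_g f1(g) g)(sum_h f2(h) h), collected by the element gh."""
    out = {g: 0 for g in G}
    for g in G:
        for h in G:
            out[compose(g, h)] += f1[g] * f2[h]
    return out

G = list(permutations(range(3)))
f1 = {g: i + 1 for i, g in enumerate(G)}          # arbitrary test coefficients
f2 = {g: (i * i) % 5 for i, g in enumerate(G)}    # arbitrary test coefficients
same = convolve(f1, f2, G) == algebra_product(f1, f2, G)
print(same)
```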


[2] Minimal projections. $G$ is again assumed to be a finite group and $\hat{G}$ the set of all inequivalent irreducible unitary representations of $G$. Note that since $G$ is finite, any finite dimensional representation of $G$ is equivalent to a unitary representation. We denote by $\{D_\alpha(g) : \alpha \in \hat{G}\}$ a complete set of irreducible unitary representations of $G$. We have by the Peter-Weyl theorem, for any function $f$ on $G$,

$$f(g) = \sum_{1 \le i,j \le d(\alpha), \alpha \in \hat{G}} c(\alpha,i,j)[D_\alpha(g)]_{ij}$$

where $d(\alpha)$ is the dimension of the representation $D_\alpha$ and

$$c(\alpha,i,j) = \frac{d(\alpha)}{o(G)}\sum_{g \in G} f(g)\overline{[D_\alpha(g)]_{ij}}$$
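A projection $p \in \mathcal{A}(G)$ is defined by the condition
$$p^2 = p$$

Let $p$ be a projection. Then, by the Peter-Weyl theorem,

$$p = \sum_{g \in G} p(g)g = \sum_{\alpha \in \hat{G}} p_\alpha$$
where
$$p_\alpha(g) = d(\alpha)\sum_{i,j=1}^{d(\alpha)} \langle p, [D_\alpha]_{ij}\rangle [D_\alpha(g)]_{ij}$$

and
$$\langle u, v\rangle = o(G)^{-1}\sum_{g \in G} u(g)\overline{v(g)}$$

for any two functions $u, v$ on $G$. Now, by the Schur orthogonality relations, we have as elements of $\mathcal{A}(G)$, and with

$$c(p,\alpha,i,j) = d(\alpha)\langle p, [D_\alpha]_{ij}\rangle,$$
$$p_\alpha.p_\beta = \sum_{g \in G}(p_\alpha * p_\beta)(g).g$$
and
$$(p_\alpha * p_\beta)(g) = \sum_{i,j,k,m} c(p,\alpha,i,j)c(p,\beta,k,m)([D_\alpha]_{ij} * [D_\beta]_{km})(g)$$
with
$$([D_\alpha]_{ij} * [D_\beta]_{km})(g) = \sum_{h \in G}[D_\alpha(h)]_{ij}.[D_\beta(h^{-1}g)]_{km}$$

$$= \sum_{h \in G, l}[D_\alpha(h)]_{ij}[D_\beta(h^{-1})]_{kl}[D_\beta(g)]_{lm}$$

$$= \sum_{h \in G, l}[D_\alpha(h)]_{ij}\overline{[D_\beta(h)]_{lk}}[D_\beta(g)]_{lm}$$

and this is zero if $\alpha \neq \beta$ (Schur's orthogonality relation, which states that matrix elements of inequivalent irreducible unitary representations are orthogonal) and if $\alpha = \beta$, then this is contained in the vector space of functions

$$V_\alpha = span\{[D_\alpha(g)]_{ij} : 1 \le i,j \le d(\alpha)\}$$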


Advanced Probability and Statistics: Remarks and Problems 47 


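Thus we get in $\mathcal{A}(G)$,
$$p_\alpha.p_\beta = 0, \ \alpha \neq \beta, \quad p_\alpha^2 \in V_\alpha$$

Since $p = p^2$, we also have

$$\sum_\alpha p_\alpha = p = p^2 = \sum_{\alpha,\beta} p_\alpha p_\beta = \sum_\alpha p_\alpha^2,$$

which implies (since $p_\alpha, p_\alpha^2$ are in $V_\alpha$ and the $V_\alpha$'s are all mutually orthogonal) that

$$p_\alpha^2 = p_\alpha$$

Hence,

$$p_\alpha p_\beta = \delta(\alpha,\beta)p_\alpha$$

or equivalently in terms of functions,

$$(p_\alpha * p_\beta)(g) = \delta(\alpha,\beta)p_\alpha(g), \quad \alpha, \beta \in \hat{G}$$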

A projection $p$ is said to be minimal if it cannot be decomposed as

$$p = p_1 + p_2$$

with both $p_1$ and $p_2$ being nonzero projections. Thus, by the above decomposition, $p$ is minimal iff $p = p_\alpha$ for some $\alpha \in \hat{G}$. We have thus proved that the set of all minimal projections in the group algebra of a finite group is in one-one correspondence with the set of all irreducible representations of $G$. Now suppose $p$ is a minimal projection associated to $\alpha \in \hat{G}$. Then, we can write

$$p(g) = \sum_{i,j} c(ij)[D_\alpha(g)]_{ij}$$

or equivalently,

$$p = \sum_{i,j} c(ij)[D_\alpha]_{ij}$$
in the group algebra $\mathcal{A}(G)$, where $[D_\alpha]_{ij}$ stands for $\sum_{g \in G}[D_\alpha(g)]_{ij}\,g$. The condition $p^2 = p$ implies that

$$\sum_{l,m,g} c(lm)[D_\alpha(g)]_{lm}.g = p = p^2 = \sum_{i,j,k,m} c(ij)c(km)[D_\alpha]_{ij}.[D_\alpha]_{km}$$

$$= \sum_{i,j,k,m,g} c(ij)c(km)([D_\alpha]_{ij} * [D_\alpha]_{km})(g)\,g$$

$$= \sum_{i,j,k,m,l} c(ij)c(km)\Big(\sum_{h \in G}[D_\alpha(h)]_{ij}.\overline{[D_\alpha(h)]_{lk}}\Big)\sum_{g \in G}[D_\alpha(g)]_{lm}.g$$

$$= \sum_{i,j,k,m,l} c(ij)c(km)\,o(G).d(\alpha)^{-1}\delta(i,l)\delta(j,k).\sum_{g \in G}[D_\alpha(g)]_{lm}.g$$

and therefore,

$$c(lm) = (o(G)/d(\alpha))\sum_{j} c(lj)c(jm)$$

or equivalently, in matrix notation,

$$(d(\alpha)/o(G))C = C^2$$

If we further impose the restriction that $p$ is a central projection, ie, $p$ commutes with $\mathcal{A}(G)$ apart from being minimal, then the only solution to the above equation is

$$C = (d(\alpha)/o(G))I_{d(\alpha)}$$
and hence we find that in this case
$$p(g) = (d(\alpha)/o(G))\sum_{i=1}^{d(\alpha)}[D_\alpha(g)]_{ii} = (d(\alpha)/o(G))\chi_\alpha(g)$$

where $\chi_\alpha(g) = Tr(D_\alpha(g))$ is the character of the representation $D_\alpha$.
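As an illustration of the last formula, the following sketch builds $p_\alpha(g) = (d(\alpha)/o(G))\chi_\alpha(g)$ for the three irreducible characters of $S_3$ and checks that they are mutually orthogonal idempotents under convolution that sum to the identity element. It assumes the well-known character table of $S_3$ (trivial, sign, and the two dimensional character equal to the number of fixed points minus one); the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def compose(g, h):
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def sign(g):
    inversions = sum(g[i] > g[j] for i in range(len(g)) for j in range(i + 1, len(g)))
    return -1 if inversions % 2 else 1

def convolve(f1, f2, G):
    """(f1 * f2)(g) = sum_h f1(g h^{-1}) f2(h)."""
    return {g: sum(f1[compose(g, inverse(h))] * f2[h] for h in G) for g in G}

G = list(permutations(range(3)))
e = tuple(range(3))
characters = {
    "trivial": lambda g: 1,
    "sign": sign,
    "standard": lambda g: sum(g[i] == i for i in range(3)) - 1,
}
# p_alpha(g) = (d(alpha) / o(G)) chi_alpha(g), with d(alpha) = chi_alpha(e).
proj = {name: {g: Fraction(chi(e) * chi(g), len(G)) for g in G}
        for name, chi in characters.items()}

zero = {g: Fraction(0) for g in G}
orthogonal_idempotents = all(
    convolve(proj[a], proj[b], G) == (proj[a] if a == b else zero)
    for a in proj for b in proj
)
total = {g: sum(p[g] for p in proj.values()) for g in G}
resolves_identity = total == {g: Fraction(int(g == e)) for g in G}
print(orthogonal_idempotents, resolves_identity)
```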


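Prove that in $\mathcal{A}(G)$,
$$p = \sum_{i,j=1}^{d(\alpha)} c(ij)[D_\alpha]_{ij}$$
commutes with all $[D_\beta]_{ij}, i,j = 1,2,\ldots,d(\beta)$, $\beta \in \hat{G}, \beta \neq \alpha$. (In fact, the Schur orthogonality relations imply that $p.[D_\beta]_{ij} = [D_\beta]_{ij}.p = 0, \beta \neq \alpha$.) Prove further that $p$ also commutes with $[D_\alpha]_{ij}, i,j = 1,2,\ldots,d(\alpha)$ iff $C = ((c(ij)))$ is a scalar multiple of $I_{d(\alpha)}$.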


Remark: We have shown that the problem of determining all the mini- 
mal central projections in A(G) is equivalent to determining all the irreducible 
characters of G which is in turn equivalent to determining all the inequivalent 
irreducible representations of G. This fact will play a fundamental role in our 
determination of all the irreducible representations of the permutation groups. 


[3] $S_m$ is the group of permutations of $m$ elements. Any $\sigma \in S_m$ can be represented as

$$\sigma = (i_1,\ldots,i_{l_1})(i_{l_1+1},\ldots,i_{l_1+l_2})\ldots(i_{l_1+\ldots+l_{p-1}+1},\ldots,i_{l_1+\ldots+l_p})$$

where $(i_1,\ldots,i_{l_1+\ldots+l_p})$ is a permutation of $(1,2,\ldots,m)$ and if $a_1,\ldots,a_r$ are distinct integers in $\{1,2,\ldots,m\}$, then $(a_1,\ldots,a_r)$ denotes the cyclic permutation that sends $a_i \to a_{i+1}, i = 1,2,\ldots,r-1$, $a_r \to a_1$ and leaves the other integers fixed. In short, $(a_1,\ldots,a_r)$ is a cyclic permutation in $S_m$ with cycle length $r$. We thus state this result as: Every permutation is a product of disjoint cycles. Further, in the above notation, let $\rho$ denote the permutation $\{1,2,\ldots,m\} \to \{i_1,\ldots,i_m\}$ where of course $m = l_1 + \ldots + l_p$. We also define the permutation

$$g = (1,2,\ldots,l_1).(l_1+1,\ldots,l_1+l_2)\ldots(l_1+\ldots+l_{p-1}+1,\ldots,l_1+\ldots+l_p)$$

expressed as a product of cycles. Then, it is clear that

$$\sigma\rho = \rho g$$
ie,

$$\sigma = \rho.g.\rho^{-1}$$
It is clear from this formula that each conjugacy class in $S_m$ consists precisely of those elements having the same cycle structure. More precisely, we say that a permutation $\sigma \in 1^{k_1}2^{k_2}\ldots m^{k_m}$ iff in the cycle representation of $\sigma$, there are $k_j$ cycles of length $j$ for each $j = 1,2,\ldots,m$. Of course we must have $\sum_{j=1}^{m} jk_j = m$. The conjugacy classes in $S_m$ are therefore labeled by the integers $(k_1,\ldots,k_m)$. $1^{k_1}\ldots m^{k_m}$ is a conjugacy class. The number of elements in this conjugacy class is easily seen to be

$$\mu(k_1,k_2,\ldots,k_m) = \frac{m!}{k_1!\ldots k_m!\,1^{k_1}\ldots m^{k_m}}$$


In fact, first simply write down all the cycles in this class serially as above in non-decreasing order of their lengths. Then we can permute all the $m$ elements in this serial representation in $m!$ ways. However, a given cycle of length $j$ can be represented in $j$ possible ways simply by applying a cyclic permutation to the elements in this cycle, and there are $j$ possible cyclic permutations. So the number of permutations within each cycle which do not alter the cyclic representation is $\prod_j j^{k_j}$ because there are $k_j$ cycles of length $j$ and each such cycle can be represented in $j$ possible ways by applying cyclic permutations. Further, the cyclic representation of a permutation is not altered if we simply permute the cycles of the same length amongst themselves. The total number of such permutations is simply $k_1!\ldots k_m!$. Thus we obtain the above formula.
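The counting formula can be verified by brute force for small $m$. The sketch below enumerates $S_4$, groups the permutations by cycle type $(k_1,\ldots,k_m)$ and compares the counts with $m!/(k_1!\ldots k_m!\,1^{k_1}\ldots m^{k_m})$; the function names are illustrative.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def cycle_type(perm):
    """Return (k_1, ..., k_m), where k_j is the number of cycles of length j."""
    m = len(perm)
    seen = [False] * m
    k = [0] * m
    for i in range(m):
        if not seen[i]:
            length, j = 0, i
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            k[length - 1] += 1
    return tuple(k)

def class_size(k):
    """m! / (prod_j k_j! * j^{k_j}) with m = sum_j j * k_j."""
    m = sum(j * kj for j, kj in enumerate(k, start=1))
    denom = 1
    for j, kj in enumerate(k, start=1):
        denom *= factorial(kj) * j ** kj
    return factorial(m) // denom

m = 4
counts = Counter(cycle_type(p) for p in permutations(range(m)))
formula_matches = all(counts[k] == class_size(k) for k in counts)
print(len(counts), formula_matches)
```

The number of distinct cycle types found equals the number of partitions of $m$, which is the number of conjugacy classes.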


[4] Young frames and Young tableaux: Given positive integers $m_1 \ge m_2 \ge \ldots \ge m_k > 0$ such that $m_1 + \ldots + m_k = m$, we draw a tableau consisting of rows of boxes one below the other starting at the same left end line such that the first row has $m_1$ boxes, the second row has $m_2$ boxes, ..., the $k^{th}$ row has $m_k$ boxes. We denote such a frame by $F(m_1,\ldots,m_k)$ and call it a Young frame. It is easily seen that the total number of Young frames for fixed $m$ is simply the total number of conjugacy classes of $S_m$. In fact, we can directly construct a bijection from the set of all conjugacy classes onto the set of all Young frames as follows. Given a conjugacy class $1^{k_1}\ldots m^{k_m}$, some of the $j^{k_j}$ will be missing, ie, $k_j = 0$ for such $j$. Thus, we can express this conjugacy class as $r_1^{l_1}\ldots r_p^{l_p}$ where $l_1,\ldots,l_p > 0$ and $1 \le r_1 < r_2 < \ldots < r_p \le m$, ie, in each element of this conjugacy class, there are $l_j$ cycles of length $r_j$ for each $j = 1,2,\ldots,p$. Then we define $m_1 = \ldots = m_{l_p} = r_p$, $m_{l_p+1} = \ldots = m_{l_p+l_{p-1}} = r_{p-1}$ etc, ie, in other words, as an ordered $k$-tuple,

$$(m_1, m_2, \ldots, m_k) = (r_p,\ldots,r_p,r_{p-1},\ldots,r_{p-1},\ldots,r_1,\ldots,r_1)$$

where $r_j$ occurs $l_j$ times for each $j = p, p-1, \ldots, 1$. Note that $k = l_1 + \ldots + l_p$ and $m_1 + \ldots + m_k = \sum_{j=1}^{p} l_jr_j = m$.
Since the number of distinct classes of $G = S_m$ equals the total number of inequivalent irreducible representations of $G$, it follows that this number is also the same as the number of distinct Young frames.

A Young tableau $F(T)$ corresponding to a Young frame $F = F(m_1,\ldots,m_k)$ is simply an arrangement of the $m$ integers $1,2,\ldots,m$ in these $m = m_1 + \ldots + m_k$ boxes, ie, in each box, we put an integer from $1,2,\ldots,m$ and no two boxes have the same integer. Associated with the Young tableau $F(T)$, we define $R(T)$ to be the group of all permutations that permute the entries within each row of $F(T)$ and $C(T)$ to be the group of all permutations that permute the entries within each column of $F(T)$. Note that the number of elements in $R(T)$ equals $m_1!\ldots m_k!$ and the number of elements in $C(T)$ equals $n_1!\ldots n_l!$ where $n_1,\ldots,n_l$ are the column lengths of $F(T)$. Note that $R(T)$ and $C(T)$ are subgroups of $S_m$. Define

$$E(T) = \sum_{p \in R(T), q \in C(T)} (-1)^q pq = P(T)Q(T)$$

where

$$P(T) = \sum_{p \in R(T)} p, \quad Q(T) = \sum_{q \in C(T)} (-1)^q q$$

and $(-1)^q$ denotes the signature of $q$. These elements are understood to be interpreted as elements of the group algebra $\mathcal{A}(S_m)$ of $S_m$. Our aim is to prove that corresponding to each Young frame $F$, there is exactly one irreducible character $\chi_F$ of $S_m$, or equivalently, exactly one inequivalent irreducible representation of $S_m$, and as $F$ runs over all the Young frames, $\chi_F$ will run over all the distinct irreducible characters of $S_m$. The second part is obvious if we can prove that for two distinct Young frames $F, F'$, the irreducible characters $\chi_F, \chi_{F'}$ associated with them are distinct, since as remarked above, the number of distinct Young frames equals the number of classes of $S_m$ and the number of classes of a finite group equals exactly the number of inequivalent irreducible representations of the group.

Now let $T, T'$ be two Young tableaux. We say that $T\,R\,T'$, ie, $T$ is related to $T'$, if there exists a pair of indices $(i,j)$ belonging to the same column of $T$ and to the same row of $T'$; otherwise, we say that $T\,NR\,T'$, ie, $T$ is not related to $T'$. Now suppose $T\,R\,T'$. Then we have for some $(i,j)$ that

$$(i,j) \in C(T) \cap R(T')$$
and therefore, with $(i,j)$ denoting the transposition of $i$ and $j$, we have
$$(i,j)P(T') = P(T'), \quad Q(T)(i,j) = -Q(T)$$

Now let $T, T'$ be two tableaux with corresponding frames $F = F(T)$, $F' = F(T')$. Note that $F = F(T)$ is completely specified by integers $m_1 \ge m_2 \ge \ldots \ge m_k > 0$ and $F' = F(T')$ is completely specified by integers $m_1' \ge m_2' \ge \ldots \ge m_l' > 0$. We say that $F = F'$ if $l = k$ and $m_j = m_j', j = 1,2,\ldots,k$. Otherwise, we write $F \neq F'$. We write $F > F'$ if for the first index $j = 1,2,\ldots$ for which $m_j \neq m_j'$, we have $m_j > m_j'$. Obviously $F \neq F'$ iff either $F > F'$ or else $F' > F$. Now suppose $F = F'$ and $T\,NR\,T'$. First observe that by
definition of $NR$, all the entries in the first column of $T$ occupy different rows of $T'$. Then consider the first two entries in the first column of $T$. By definition of $NR$, these two entries must fall in different rows of $T'$. Hence, by applying a column permutation $q_1$ to $T$ and a row permutation $p_1'$ to $T'$, we can ensure that these two entries occupy the same positions in $q_1T$ and in $p_1'T'$. Then we also observe that $q_1T\,NR\,p_1'T'$ and hence by applying the same argument to these new tableaux, we can ensure the existence of a column permutation $q_2$ of $q_1T$ and a row permutation $p_2'$ of $p_1'T'$ such that the next pair of entries in the first column of $q_2q_1T$ occupy the same positions in $p_2'p_1'T'$ without altering the positions of the previous pair in the two tableaux. In this way, we finally end up with a column permutation $q$ of $T$ and a row permutation $p'$ of $T'$ such that $qT = p'T'$. Then, $T' = p'^{-1}qT$ and we get

$$T = q^{-1}p'T'$$
and
$$q^{-1}p'q \in q^{-1}R(T')q = q^{-1}R(p'T')q = R(q^{-1}p'T') = R(T)$$
Thus defining
$$p = q^{-1}p'q$$
we get that
$$p \in R(T), \ q \in C(T), \ q^{-1}p' = pq^{-1}, \ T = pq^{-1}T', \ T' = qp^{-1}T$$

We have thus proved the following theorem:

Theorem: Let $T, T'$ have the same shape. Then, if $T\,NR\,T'$, there exist $p \in R(T), q \in C(T)$ such that $T' = qpT$.


Now, let $T$ be any tableau and let $g \in S_m$ be such that $g \notin R(T).C(T)$. Then, $g^{-1} \notin C(T)R(T)$ and we prove the existence of $p_0 \in R(T), q_0 \in C(T)$ such that $(-1)^{q_0} = -1$, $p_0gq_0 = g$. Indeed, define $T' = g^{-1}T$. Then $T'$ cannot be written as $abT$ where $a \in C(T), b \in R(T)$. In fact the equation $T' = hT$ uniquely determines $h$ as $g^{-1}$. Thus, by the previous theorem, $T\,R\,T'$ and hence there exists a pair $(i,j)$ falling in the same column of $T$ and the same row of $T'$. Define

$$q_0 = (i,j), \quad p_0 = gq_0^{-1}g^{-1} = gq_0g^{-1}$$
Then, we get
$$(-1)^{q_0} = -1, \quad p_0gq_0 = g,$$
and
$$p_0 = gq_0g^{-1} \in gR(T')g^{-1} = R(gT') = R(T)$$

and the proof of the claim is complete. We state this as a theorem.

Theorem: If $g \notin R(T)C(T)$, then there exist a $p_0 \in R(T)$ and a $q_0 \in C(T)$ such that $(-1)^{q_0} = -1$ and $p_0gq_0 = g$.
Now we are in a position to prove one of the main theorems in the Frobenius-Young theory.

Theorem: Let $s \in \mathcal{A}(S_m)$ be such that

$$psq = (-1)^q s \quad \forall p \in R(T), q \in C(T)$$

Then
$$s = s(e)E(T)$$

Proof: Write
$$s = \sum_{g \in G} s(g)g, \quad G = S_m$$

Then the stated hypothesis implies that
$$s(pgq) = (-1)^q s(g), \quad \forall g \in G, p \in R(T), q \in C(T)$$

(Note that $R(T), C(T)$ are subgroups of $G$ so that $p \in R(T)$ iff $p^{-1} \in R(T)$ and likewise $q \in C(T)$ iff $q^{-1} \in C(T)$.) Taking $g = e$ gives us

$$s(pq) = (-1)^q s(e), \quad p \in R(T), q \in C(T)$$

Now suppose $g \notin R(T)C(T)$. Then by the previous theorem, there exist $p_0 \in R(T), q_0 \in C(T)$ such that

$$(-1)^{q_0} = -1, \quad p_0gq_0 = g$$

Then,
$$s(g) = s(p_0gq_0) = (-1)^{q_0}s(g) = -s(g)$$

and hence $s(g) = 0$. Therefore, we have proved that

$$s = \sum_{p \in R(T), q \in C(T)} s(pq)pq = s(e)\sum_{p \in R(T), q \in C(T)} (-1)^q pq = s(e)E(T)$$

and the proof of the theorem is complete.


Corollary: $E(T)^2 = k(T)E(T)$ for some $k(T) \in \mathbb{R}$. In fact, we have

$$E(T) = P(T)Q(T)$$

and hence,
$$pE(T)^2q = pP(T)Q(T)P(T)Q(T)q = (-1)^q E(T)^2, \quad p \in R(T), q \in C(T)$$

since
$$pP(T) = P(T), \quad Q(T)q = (-1)^q Q(T), \quad p \in R(T), q \in C(T)$$
Thus, by the theorem,

$$E(T)^2 = k(T)E(T)$$

for some $k(T) \in \mathbb{R}$. To evaluate the value of $k(T)$, we evaluate $Tr(R_{E(T)})$ and $Tr(R_{E(T)^2}) = Tr(R_{E(T)}^2)$, where for any $s \in \mathcal{A}(G)$, we define the linear operator $R_s$ on the vector space $\mathcal{A}(G)$ by
$$R_sf = fs$$
Then, for any $g \in G$, it is clear by choosing the standard basis $\{h : h \in G\}$ for $\mathcal{A}(G)$ that
$$Tr(R_g) = 0, \ g \neq e; \quad Tr(R_e) = m!$$

(Recall that $G = S_m$.) Hence, since

$$R_s = \sum_{g \in G} s(g)R_g, \quad s \in \mathcal{A}(G)$$
we get that
$$R_{E(T)} = \sum_{p \in R(T), q \in C(T)} (-1)^q R_{pq}$$
so that
$$Tr(R_{E(T)}) = m!$$
Now let

$$d = rank(R_{E(T)}) = \dim \mathcal{R}(R_{E(T)}) = \dim[\mathcal{A}(G)E(T)]$$
Then, let $\{f_1,\ldots,f_d\}$ be a basis for $\mathcal{R}(R_{E(T)})$. We define the linear operator
$$A = k(T)^{-1}R_{E(T)}$$
acting on the vector space $\mathcal{A}(G)$. Then,
$$A^2 = A$$
ie, $A$ is a projection. Thus,
$$Tr(A^2) = Tr(A) = d$$
Combining these two formulae gives us
$$d = Tr(k(T)^{-1}R_{E(T)}) = k(T)^{-1}m!$$

so that
$$k(T) = m!/d$$

and we conclude that

$$E(T)^2 = (m!/d)E(T)$$
Now define

$$e(T) = (d/m!)E(T)$$

Then,
$$e(T)^2 = (d/m!)^2E(T)^2 = (d/m!)E(T) = e(T)$$
in other words, $e(T)$ is an idempotent element of the group algebra $\mathcal{A}(G)$. We also note that if $T, T'$ are different tableaux having the same shape $F = F(T) = F(T') = F'$, then $e(T)e(T') = 0$.
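The identity $E(T)^2 = (m!/d)E(T)$ can be checked on the smallest nontrivial case. The sketch below assumes $m = 3$ and the tableau with rows $\{0,1\},\{2\}$ and columns $\{0,2\},\{1\}$ (entries relabeled from 0 for convenience), computes $d$ as the rank of right multiplication by $E(T)$ with exact rational arithmetic, and compares $E(T)^2$ with $(m!/d)E(T)$. All helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def compose(g, h):
    return tuple(g[h[i]] for i in range(len(h)))

def sign(g):
    inv = sum(g[i] > g[j] for i in range(len(g)) for j in range(i + 1, len(g)))
    return -1 if inv % 2 else 1

def stabilizer(blocks, m):
    """All permutations of range(m) mapping every block onto itself."""
    return [g for g in permutations(range(m))
            if all({g[i] for i in b} == set(b) for b in blocks)]

def multiply(x, y):
    """Product in the group algebra; elements are dicts permutation -> coefficient."""
    out = {}
    for g, a in x.items():
        for h, b in y.items():
            gh = compose(g, h)
            out[gh] = out.get(gh, 0) + a * b
    return {g: c for g, c in out.items() if c != 0}

def rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][c] != 0:
                f = rows[i][c] / rows[rk][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

m = 3
rows_T, cols_T = [[0, 1], [2]], [[0, 2], [1]]
G = list(permutations(range(m)))
E = {}
for p in stabilizer(rows_T, m):          # p in R(T)
    for q in stabilizer(cols_T, m):      # q in C(T)
        pq = compose(p, q)
        E[pq] = E.get(pq, 0) + sign(q)

# Row h of the matrix holds the coefficients of h E(T), so its rank is dim A(G)E(T).
d = rank([[multiply({h: 1}, E).get(g, 0) for g in G] for h in G])
k = Fraction(factorial(m), d)
E2 = multiply(E, E)
print(d, k, E2 == {g: k * c for g, c in E.items()})
```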


(ii) The Frobenius character formula for induced representations of a finite group.

[1] Alternate definition of the induced representation.

Let $G$ be a finite group and $H$ a subgroup of $G$. Let $L$ be a unitary representation of $H$ in the Hilbert space $Y$. We define

$$U = Ind_H^G L$$

ie, $U$ is the representation of $G$ induced by the representation $L$ of $H$. There are many equivalent ways to define $U$. All these definitions give isomorphic representations of $G$. One way is to define the representation space $X$ of $U$ as the set of all $f \in C(G,Y)$ for which $f(gh) = L(h)^{-1}f(g), h \in H, g \in G$ and then $U(g)f(x) = f(g^{-1}x), g, x \in G, f \in X$. It should be noted from this definition that $f \in X$ is completely determined if its value is known on any set of representatives of the cosets $G/H$. Hence, we may equivalently view any $f \in X$ as a map from $G/H \to Y$. In fact, let $\gamma$ be a cross section map for this coset space, ie, $\gamma : G/H \to G$ is such that $\gamma(x)H = x, x \in G/H$. Then consider the element of $G$

$$h(g,x) = \gamma(x)^{-1}g\gamma(g^{-1}x), \quad g \in G, x \in G/H$$
Clearly since
$$h(g,x)H = \gamma(x)^{-1}g\gamma(g^{-1}x)H = \gamma(x)^{-1}gg^{-1}x = H$$

it follows that
$$h(g,x) \in H, \quad g \in G, x \in G/H$$

Then for a mapping $\psi : G/H \to Y$, consider
$$(V(g)\psi)(x) = L(h(g,x))\psi(g^{-1}x)$$
We observe that
$$(V(g_2)(V(g_1)\psi))(x) = L(h(g_2,x))(V(g_1)\psi)(g_2^{-1}x)$$

$$= L(h(g_2,x))L(h(g_1,g_2^{-1}x))\psi(g_1^{-1}g_2^{-1}x)$$
$$= L(h(g_2,x).h(g_1,g_2^{-1}x))\psi((g_2g_1)^{-1}x)$$

and
$$h(g_2,x)h(g_1,g_2^{-1}x) = (\gamma(x)^{-1}g_2\gamma(g_2^{-1}x)).(\gamma(g_2^{-1}x)^{-1}g_1\gamma(g_1^{-1}g_2^{-1}x))$$

$$= \gamma(x)^{-1}g_2g_1\gamma((g_2g_1)^{-1}x) = h(g_2g_1,x)$$
and therefore,

$$V(g_2)V(g_1)\psi(x) = L(h(g_2g_1,x))\psi((g_2g_1)^{-1}x) = V(g_2g_1)\psi(x),$$

ie, as operators in $C(G/H,Y)$, the set $\{V(g) : g \in G\}$ satisfies

$$V(g_2)V(g_1) = V(g_2g_1), \quad g_1, g_2 \in G$$

Thus, $V(.)$ is a representation of $G$ in $C(G/H,Y)$. We wish to prove that $V$ is equivalent to the induced representation $U = Ind_H^G L$. Indeed, consider the map

$$T : C(G/H,Y) \to X$$

defined by
$$(T\psi)(g) = L(h(g^{-1},H))\psi(gH), \quad g \in G$$

To see that this map is well defined, we must first show that the rhs is indeed an element of $X$, ie, it satisfies

$$(T\psi)(gh) = L(h)^{-1}(T\psi)(g), \quad g \in G, h \in H$$

Indeed, this follows from

$$(T\psi)(gh) = L(h((gh)^{-1},H))\psi(gH)$$

and
$$h((gh)^{-1},H) = \gamma(H)^{-1}(gh)^{-1}\gamma(ghH) = (gh)^{-1}\gamma(gH)$$

on assuming without loss of generality,

$$\gamma(H) = e$$

Thus,

$$h((gh)^{-1},H) = h^{-1}g^{-1}\gamma(gH) = h^{-1}h(g^{-1},H)$$

and therefore,
$$(T\psi)(gh) = L(h)^{-1}L(h(g^{-1},H))\psi(gH) = L(h)^{-1}(T\psi)(g)$$
proving thereby that
$$T\psi \in X$$

and hence the map $T$ is well defined. We next show that $T$ is a bijection. Indeed, $T\psi = 0$ implies
$$L(h(g^{-1},H))\psi(gH) = 0, \quad g \in G$$

which implies that
$$\psi(gH) = 0, \quad g \in G$$
ie,

$$\psi = 0$$
Thus, $T$ is injective. Next, suppose $\phi \in X$. Then, define $\psi \in C(G/H,Y)$ by
$$\psi(gH) = L(h(g^{-1},H))^{-1}\phi(g), \quad g \in G$$
To show that $\psi$ is a well defined element of $C(G/H,Y)$, we must show that
$$L(h((gh)^{-1},H))^{-1}\phi(gh) = L(h(g^{-1},H))^{-1}\phi(g), \quad g \in G, h \in H$$
But, for all $h \in H, g \in G$,
$$h((gh)^{-1},H) = \gamma(H)^{-1}h^{-1}g^{-1}\gamma(ghH)$$

$$= h^{-1}g^{-1}\gamma(gH) = h^{-1}h(g^{-1},H)$$
Then,

$$L(h((gh)^{-1},H))^{-1}\phi(gh) = L(h^{-1}h(g^{-1},H))^{-1}.L(h)^{-1}\phi(g)$$

$$= L(h(g^{-1},H))^{-1}\phi(g)$$

proving that $\psi \in C(G/H,Y)$ and by construction, $T\psi = \phi$. Thus, $T$ is also surjective. Hence, $T$ is bijective. Finally, we prove that $T$ intertwines the representations $U$ and $V$ and this will establish the equivalence of these two representations and hence provide an alternate equivalent definition of the induced representation. We have for $\psi \in C(G/H,Y)$,

$$(U(g)T\psi)(g_1) = (T\psi)(g^{-1}g_1) = L(h(g_1^{-1}g,H))\psi(g^{-1}g_1H)$$
on the one hand, while on the other
$$(TV(g)\psi)(g_1) = L(h(g_1^{-1},H))(V(g)\psi)(g_1H)$$
$$= L(h(g_1^{-1},H)).L(h(g,g_1H))\psi(g^{-1}g_1H)$$
$$= L(h(g_1^{-1},H).h(g,g_1H))\psi(g^{-1}g_1H)$$
and since
$$h(g_1^{-1},H).h(g,g_1H) = (\gamma(H)^{-1}g_1^{-1}\gamma(g_1H)).(\gamma(g_1H)^{-1}g\gamma(g^{-1}g_1H))$$

$$= g_1^{-1}g\gamma(g^{-1}g_1H) = h(g_1^{-1}g,H)$$
it follows that
$$U(g)T = TV(g), \quad g \in G$$

ie, $T$ intertwines the representations $U$ of $G$ in $X$ and $V$ of $G$ in $C(G/H,Y)$ and since $T$ is a bijection, we can write this intertwining relation as

$$V(g) = T^{-1}U(g)T, \quad g \in G$$

proving the equivalence of $U$ and $V$.
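The cocycle form of the induced representation can be made concrete on a small example. The following sketch assumes $G = S_3$, $H = \{e, (0\,1)\}$ and $L$ the sign character of $H$ (so $Y = \mathbb{C}$), picks the cross section $\gamma$ that sends each coset to its smallest element (which gives $\gamma(H) = e$), and checks $V(g_2)V(g_1) = V(g_2g_1)$ for all pairs. These choices and the helper names are illustrative, not taken from the text.

```python
from itertools import permutations

def compose(g, h):
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

G = list(permutations(range(3)))
e = (0, 1, 2)
H = [e, (1, 0, 2)]
L = {e: 1, (1, 0, 2): -1}          # sign character of H, one dimensional

def coset(g):
    return frozenset(compose(g, h) for h in H)

cosets = sorted({coset(g) for g in G}, key=min)
gamma = {x: min(x) for x in cosets}  # cross section with gamma(H) = e

def h_cocycle(g, x):
    """h(g, x) = gamma(x)^{-1} g gamma(g^{-1} x), an element of H."""
    gx = coset(compose(inverse(g), gamma[x]))
    return compose(inverse(gamma[x]), compose(g, gamma[gx]))

def V(g):
    """Matrix of (V(g) psi)(x) = L(h(g, x)) psi(g^{-1} x) in the basis of cosets."""
    M = [[0] * len(cosets) for _ in cosets]
    for i, x in enumerate(cosets):
        gx = coset(compose(inverse(g), gamma[x]))
        M[i][cosets.index(gx)] = L[h_cocycle(g, x)]
    return M

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

is_representation = all(
    matmul(V(g2), V(g1)) == V(compose(g2, g1)) for g1 in G for g2 in G
)
print(is_representation, V(e))
```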
[2] Frobenius character formula for the induced representation.

Let $G$ be a finite group and $H$ a subgroup of $G$. Let $L$ be a unitary representation of $H$ and let $U = Ind_H^G L$ be the induced representation. Let $L$ act in the Hilbert space $Y$. Then let $\{\phi_1,\ldots,\phi_n\}$ be an onb for $Y$. Define $c = \sqrt{o(G)/o(H)}$ and $f_k(g) = cL(g)^{-1}\phi_k$ if $g \in H$ and $f_k(g) = 0$ if $g \in G - H$. We claim that $\{f_1,\ldots,f_n\}$ is an orthonormal set in the representation space $X$ of $U$. Note that this space $X$ is defined as the set of all elements $f \in C(G,Y)$ for which $f(gh) = L(h)^{-1}f(g), g \in G, h \in H$. First we observe that if $g \in G - H, h \in H$, then $gh \in G - H$ in which case $f_k(gh) = 0 = f_k(g)$ by definition. Thus in this case, $f_k(gh) = L(h)^{-1}f_k(g) = 0$ holds. Next we observe that if $g, h \in H$, then $gh \in H$ in which case, we get by definition, $f_k(gh) = c.L(gh)^{-1}\phi_k = L(h)^{-1}c.L(g)^{-1}\phi_k = L(h)^{-1}f_k(g)$. This completes the proof of the claim that $f_k \in X, k = 1,2,\ldots,n$. Next we calculate

$$\langle f_k, f_m\rangle = \frac{1}{o(G)}\sum_{g \in G}\langle f_k(g), f_m(g)\rangle$$

$$= \frac{c^2}{o(G)}\sum_{h \in H}\langle L(h)^{-1}\phi_k, L(h)^{-1}\phi_m\rangle$$

$$= \frac{c^2o(H)}{o(G)}\langle\phi_k,\phi_m\rangle = \frac{c^2o(H)}{o(G)}\delta_{km} = \delta_{km}$$
since
$$c^2 = o(G)/o(H)$$
Thus, $\{f_1,\ldots,f_n\}$ is an orthonormal set in $X$. Let $\{k_1 = e, k_2,\ldots,k_r\}$ be a complete set of representatives of $G/H$, ie, $k_jH, j = 1,2,\ldots,r$ are all disjoint elements in $G/H$ and $\bigcup_{j=1}^{r} k_jH = G$. Then we claim that $B = \{U(k_j)f_k, j = 1,2,\ldots,r, k = 1,2,\ldots,n\}$ is an onb for $X$. First observe that

$$\dim X = (o(G)/o(H)).\dim Y = rn$$

So if we are able to prove that the elements of $B$ defined above are orthonormal, then our claim will be proved. But,

$$\langle U(k_j)f_k, U(k_l)f_m\rangle = o(G)^{-1}\sum_{g \in G}\langle f_k(k_j^{-1}g), f_m(k_l^{-1}g)\rangle$$

$$= o(G)^{-1}\sum_{g \in G}\langle f_k(g), f_m(k_l^{-1}k_jg)\rangle = o(G)^{-1}\sum_{h \in H}\langle f_k(h), f_m(k_l^{-1}k_jh)\rangle$$
This is clearly zero if $l \neq j$ since then $k_l^{-1}k_j \notin H$, ie, the coset $k_l^{-1}k_jH$ is disjoint from $H$. On the other hand, when $l = j$, the above evaluates to

$$o(G)^{-1}\sum_{h \in H}\langle f_k(h), f_m(h)\rangle = o(H)^{-1}\sum_{h \in H}\langle L(h)^{-1}\phi_k, L(h)^{-1}\phi_m\rangle$$

$$= \langle\phi_k,\phi_m\rangle = \delta_{km}$$
This completes the proof that $B$ is indeed an onb for $X$. Now let $C$ be a class in $G$. Let $\chi_U$ denote the character of $U$ and $\chi_L$ the character of $L$. We define $\chi_U(C)$ to be the character $\chi_U$ of $U$ evaluated at any element of $C$. Note that these are all equal. Then, we can write

$$C \cap H = \bigcup_{l=1}^{M} C_l$$

where the $C_l$'s are disjoint classes in $H$. In the language of group algebras, we define
$$C = \sum_{g \in C} g$$
Then,
$$gCg^{-1} = C \ \forall g \in G$$
and
$$U(C) = \sum_{g \in C} U(g)$$
And hence,

$$Tr(U(C)) = o(C).\chi_U(C)$$

on the one hand, while on the other,

$$Tr(U(C)) = \sum_{j,k}\langle U(k_j)f_k, U(C)U(k_j)f_k\rangle$$

$$= \sum_{j,k}\langle f_k, U(k_j^{-1})U(C)U(k_j)f_k\rangle$$
$$= r\sum_{k}\langle f_k, U(C)f_k\rangle$$

$$= (r/o(G))\sum_{k, g \in C, h \in H}\langle f_k(h), f_k(g^{-1}h)\rangle$$

$$= (rc^2/o(G))\sum_{k, g \in C \cap H, h \in H}\langle L(h)^{-1}\phi_k, L(h^{-1}g)\phi_k\rangle$$

$$= r\sum_{k, g \in C \cap H}\langle\phi_k, L(g)\phi_k\rangle = r\sum_{g \in C \cap H}\chi_L(g) = r\sum_{l=1}^{M} o(C_l)\chi_L(C_l)$$

Thus,
$$\chi_U(C) = (r/o(C))\sum_{l=1}^{M} o(C_l)\chi_L(C_l).$$

Note that $r = o(G)/o(H) = c^2$ and $M$ is the number of $H$-classes in $C \cap H$ while $C_l, l = 1,2,\ldots,M$ is the enumeration of these classes.
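A quick numerical check of the character formula, as a sketch under the assumption $G = S_3$, $H = \{e,(0\,1)\}$ and $\chi_L$ the sign character of $H$: the class formula is compared with the trace of the induced operator, computed as the sum of $\chi_L(k^{-1}gk)$ over those coset representatives $k$ for which $k^{-1}gk \in H$. Function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def compose(g, h):
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]
chi_L = {(0, 1, 2): 1, (1, 0, 2): -1}   # sign character of H
r = len(G) // len(H)

def chi_U_formula(g):
    """(r / o(C)) * sum of chi_L over the elements of C lying in H, C = class of g."""
    C = {compose(compose(x, g), inverse(x)) for x in G}
    return Fraction(r, len(C)) * sum(chi_L[x] for x in C if x in chi_L)

def chi_U_trace(g):
    """Trace of the induced operator via coset representatives k of G/H."""
    reps = {min(compose(k, h) for h in H) for k in G}
    total = 0
    for k in reps:
        x = compose(compose(inverse(k), g), k)
        if x in chi_L:
            total += chi_L[x]
    return total

values = {g: chi_U_formula(g) for g in G}
agree = all(chi_U_formula(g) == chi_U_trace(g) for g in G)
print(agree, values[(0, 1, 2)], values[(1, 0, 2)], values[(1, 2, 0)])
```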
[3] The Frobenius reciprocity theorem. Let $\chi_1$ be a character of $G$ and $\chi_2$ a character of $H$. Let $\tilde{\chi}_2 = Ind_H^G\chi_2$ denote the character of $G$ induced by $\chi_2$. Then we have, by the Frobenius character formula,

$$\langle\chi_1,\tilde{\chi}_2\rangle = o(G)^{-1}\sum_{g \in G}\bar{\chi}_1(g)\tilde{\chi}_2(g)$$

$$= o(G)^{-1}\sum_{g \in C \in \mathcal{C}(G), C_l \subset C}\bar{\chi}_1(g)(o(G)/o(H)o(C))o(C_l)\chi_2(C_l)$$
where $C_l \subset C$ means that $C_l$ ranges over all the $H$-classes in $C \cap H$ and $\mathcal{C}(G)$ denotes the set of all the classes in $G$. We clearly have

$$\sum_{g \in C}\bar{\chi}_1(g) = o(C)\bar{\chi}_1(C)$$

Thus, the above becomes
$$\langle\chi_1,\tilde{\chi}_2\rangle = o(H)^{-1}\sum_{C_l \subset C \in \mathcal{C}(G)} o(C_l)\bar{\chi}_1(C)\chi_2(C_l)$$
$$= o(H)^{-1}\sum_{C_l \subset C \in \mathcal{C}(G)} o(C_l)\bar{\chi}_1(C_l).\chi_2(C_l)$$
$$= o(H)^{-1}\sum_{C_l \in \mathcal{C}(H)} o(C_l)\bar{\chi}_1(C_l).\chi_2(C_l)$$
$$= o(H)^{-1}.\sum_{h \in H}\bar{\chi}_1(h)\chi_2(h) = \langle Res_H^G\chi_1,\chi_2\rangle$$

where $Res_H^G\chi_1$ denotes the restriction of the function $\chi_1$ on $G$ to $H$. In particular, suppose $\chi_1$ and $\chi_2$ are respectively irreducible characters of $G$ and $H$. Then, the above formula can be expressed as

$$m_G(\chi_1, Ind_H^G\chi_2) = m_H(\chi_2, Res_H^G\chi_1)$$

where $m_G(\chi_1,\chi)$ is the multiplicity of the irreducible character $\chi_1$ in the expansion of a character $\chi$ of $G$ and $m_H(\chi_2,\chi)$ is the multiplicity of the irreducible character $\chi_2$ in the expansion of a character $\chi$ of $H$. This is the famous Frobenius reciprocity theorem.
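Reciprocity can be tabulated directly for a small pair of groups. The sketch below assumes $G = S_3$ and $H = \{e,(0\,1)\}$, uses the Frobenius character formula to induce each irreducible character of $H$, and compares $\langle\chi_1, Ind_H^G\chi_2\rangle_G$ with $\langle Res_H^G\chi_1,\chi_2\rangle_H$ for every pair of irreducible characters. All characters involved are real, so the complex conjugate is omitted; the names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def compose(g, h):
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def sign(g):
    inv = sum(g[i] > g[j] for i in range(len(g)) for j in range(i + 1, len(g)))
    return -1 if inv % 2 else 1

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]
r = len(G) // len(H)

chars_G = {
    "trivial": lambda g: 1,
    "sign": sign,
    "standard": lambda g: sum(g[i] == i for i in range(3)) - 1,
}
chars_H = {"trivial": lambda h: 1, "sign": sign}

def induced(chi_H):
    """chi_U(g) = (r / o(C)) * sum of chi_H over the elements of C lying in H."""
    def chi_U(g):
        C = {compose(compose(x, g), inverse(x)) for x in G}
        return Fraction(r, len(C)) * sum(chi_H(x) for x in C if x in H)
    return chi_U

def inner(chi_a, chi_b, group):
    # Real-valued characters only, so no conjugation is needed here.
    return sum(Fraction(chi_a(g)) * chi_b(g) for g in group) / len(group)

table = {
    (a, b): (inner(chars_G[a], induced(chars_H[b]), G),
             inner(chars_G[a], chars_H[b], H))
    for a in chars_G for b in chars_H
}
reciprocity_holds = all(lhs == rhs for lhs, rhs in table.values())
print(reciprocity_holds, table[("standard", "sign")])
```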


[12] With reference to the previous part, in orthogonal curvilinear coordinates $q_1, q_2$ for the plane, the electromagnetic field within a cavity resonator can be expressed as

$$E_z(t,q_1,q_2,z) = \sum_{n,p \ge 1} u_n(q_1,q_2).(2/d)^{1/2}.\cos(\pi pz/d).Re(c(n,p)\exp(-j\omega^E(n,p)t))$$

$$H_z(t,q_1,q_2,z) = \sum_{n,p \ge 1} v_n(q_1,q_2).(2/d)^{1/2}.\sin(\pi pz/d).Re(d(n,p)\exp(-j\omega^H(n,p)t))$$

$$E_\perp(t,q_1,q_2,z) = \sum_{n,p} h_n^{-2}(-\pi p/d).\nabla_\perp u_n(q_1,q_2).(2/d)^{1/2}.\sin(\pi pz/d).Re(c(n,p).\exp(-j\omega^E(n,p)t))$$

$$- \sum_{n,p} k_n^{-2}(\nabla_\perp v_n(q_1,q_2) \times \hat{z}).(2/d)^{1/2}.\sin(\pi pz/d).Re(j\mu\omega^H(n,p)d(n,p).\exp(-j\omega^H(n,p)t))$$

$$H_\perp(t,q_1,q_2,z) = \sum_{n,p} k_n^{-2}.(\pi p/d).\nabla_\perp v_n(q_1,q_2).(2/d)^{1/2}.\cos(\pi pz/d).Re(d(n,p).\exp(-j\omega^H(n,p)t))$$

$$+ \sum_{n,p} h_n^{-2}(\nabla_\perp u_n(q_1,q_2) \times \hat{z}).(2/d)^{1/2}.\cos(\pi pz/d).Re(j\epsilon\omega^E(n,p)c(n,p).\exp(-j\omega^E(n,p)t))$$

(see Exercise 2, in step 7). Show that the total energy in this confined electromagnetic field, assuming that $u_n, v_n$ are normalized, can be expressed in the form

$$U = (\epsilon/2)\int |E|^2 dS(q_1,q_2)dz + (\mu/2)\int |H|^2 dS(q_1,q_2)dz$$
$$= (1/2)\sum_{n,p}[(\epsilon + \epsilon(\pi p/dh_n)^2)c(n,p,t)^2 + (\mu/h_n^2)\tilde{c}(n,p,t)^2$$

$$+ (\mu + \mu(\pi p/dk_n)^2)d(n,p,t)^2 + (\epsilon/k_n^2)\tilde{d}(n,p,t)^2]$$
where
$$c(n,p,t) = Re(c(n,p)\exp(-j\omega^E(n,p)t)),$$
$$\tilde{c}(n,p,t) = Re(j\epsilon\omega^E(n,p)c(n,p)\exp(-j\omega^E(n,p)t)),$$
$$d(n,p,t) = Re(d(n,p)\exp(-j\omega^H(n,p)t)),$$

$$\tilde{d}(n,p,t) = Re(j\mu\omega^H(n,p)d(n,p)\exp(-j\omega^H(n,p)t))$$

Show that $U$ is time independent, ie, the total field energy in the cavity is conserved.
Hint: Write
$$c(n,p) = c_R(n,p) + jc_I(n,p)$$

where $c_R, c_I$ are real. Then,
$$(\epsilon + \epsilon(\pi p/dh_n)^2)c(n,p,t)^2 + (\mu/h_n^2)\tilde{c}(n,p,t)^2 =$$
$$(\epsilon + \epsilon(\pi p/dh_n)^2)(c_R(n,p)\cos(\omega^E(n,p)t) + c_I(n,p)\sin(\omega^E(n,p)t))^2$$

$$+ (\mu/h_n^2)(\epsilon\omega^E(n,p))^2(c_R(n,p)\sin(\omega^E(n,p)t) - c_I(n,p)\cos(\omega^E(n,p)t))^2$$

Now observe that
$$\epsilon + \epsilon(\pi p/dh_n)^2 = (h_n^2 + \pi^2p^2/d^2)\epsilon/h_n^2 = \omega^E(n,p)^2\epsilon^2\mu/h_n^2$$
and further,
$$(\mu/h_n^2)(\epsilon\omega^E(n,p))^2 = \omega^E(n,p)^2\epsilon^2\mu/h_n^2$$

and hence deduce that

$$(\epsilon + \epsilon(\pi p/dh_n)^2)c(n,p,t)^2 + (\mu/h_n^2)\tilde{c}(n,p,t)^2 = (\omega^E(n,p)^2\epsilon^2\mu/h_n^2)(c_R(n,p)^2 + c_I(n,p)^2)$$

Likewise, show that
$$(\mu + \mu(\pi p/dk_n)^2)d(n,p,t)^2 + (\epsilon/k_n^2)\tilde{d}(n,p,t)^2$$

$$= (\omega^H(n,p)^2\mu^2\epsilon/k_n^2)(d_R(n,p)^2 + d_I(n,p)^2)$$

Deduce that the energy of the field is given by

$$U = (1/2)\sum_{n,p}(\omega^E(n,p)a^E(n,p)^*a^E(n,p) + \omega^H(n,p)a^H(n,p)^*a^H(n,p))$$

where

$$a^E(n,p) = (\sqrt{\omega^E(n,p)}\,\epsilon\sqrt{\mu}/h_n)c(n,p)$$
$$a^H(n,p) = (\sqrt{\omega^H(n,p)}\,\mu\sqrt{\epsilon}/k_n)d(n,p)$$

(Recall that $c(n,p) = c_R(n,p) + jc_I(n,p)$, $d(n,p) = d_R(n,p) + jd_I(n,p)$.)
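The time independence claimed in the hint can be spot-checked numerically for a single E-mode. The sketch below uses made-up values for $\epsilon, \mu, h_n, d, p$ and $c(n,p)$ (all hypothetical, in arbitrary units) and assumes the dispersion relation $\omega^E(n,p)^2 = (h_n^2 + \pi^2p^2/d^2)/(\epsilon\mu)$ used in the hint; it evaluates the time dependent expression at several instants and compares it with the closed form.

```python
import cmath
import math

# Hypothetical parameter values, chosen only for illustration (arbitrary units).
eps, mu = 2.0, 0.5
h_n, d, p = 3.0, 1.5, 2
c = 0.7 - 0.4j

# Dispersion relation assumed in the hint: omega^2 = (h_n^2 + (pi p / d)^2) / (eps mu).
omega = math.sqrt((h_n ** 2 + (math.pi * p / d) ** 2) / (eps * mu))

def mode_energy_term(t):
    """(eps + eps (pi p / d h_n)^2) c(t)^2 + (mu / h_n^2) c~(t)^2 for one E-mode."""
    phase = cmath.exp(-1j * omega * t)
    c_t = (c * phase).real
    c_tilde = (1j * eps * omega * c * phase).real
    return (eps + eps * (math.pi * p / (d * h_n)) ** 2) * c_t ** 2 \
        + (mu / h_n ** 2) * c_tilde ** 2

closed_form = omega ** 2 * eps ** 2 * mu / h_n ** 2 * abs(c) ** 2
samples = [mode_energy_term(t) for t in (0.0, 0.1, 0.37, 1.0, 2.5)]
print(closed_form, samples)
```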


Explain how you would quantize this electromagnetic field based on the 
introduction of Bosonic creation and annihilation operators by interpreting the 
above field energy as the Hamiltonian of an infinite sequence of independent 
harmonic oscillators. 


[13] Speech → MRI conversion using artificial neural networks. This problem
outlines a procedure for predicting the MRI image data of a patient's brain from
his slurred speech data using a combination of a feed-forward neural network
and the extended Kalman filter (EKF). We assume that there is a definite
relationship between the speech signal of a patient and his dynamically varying
MRI image field. Note that the speech data is a low dimensional signal, say
100 scalar time samples, while the MRI data is much higher dimensional: again
say 100 time samples, but each sample is a vector of very large size. So we are
predicting very high dimensional data from lower dimensional data, and this
avoids the use of expensive equipment. Let $s(t)$ denote the speech signal of
the patient and $f(t)$ the MRI image data of the same patient, transformed from
a matrix image field to a vector via the Vec operation. It is assumed that
$f(t) = [f_1(t),\ldots,f_N(t)]^T$ can be expressed as some function of
$\mathbf{s}(t) = [s(t), s(t-1),\ldots,s(t-L)]^T$. The neural network is assumed to
have $K$ layers with each layer having $N$ nodes. Let $W(k,l,m)$, $1 \le k \le K$,
$1 \le l,m \le N$, denote the weights connecting the $(k-1)$th layer to the $k$th
layer. The input speech signal $\mathbf{s}(t)$ is applied at the zeroth layer. Thus,
writing the weight matrices as

$$W(k) = ((W(k,l,m)))_{1 \le l,m \le N} \in \mathbb{R}^{N \times N}$$

it follows that the signal vector at the $k$th layer is given by

$$x_k(t) = \sigma(W(k)x_{k-1}(t)), \quad k = 1,2,\ldots,K$$

where

$$x_0(t) = \mathbf{s}(t)$$

and the output vector of the network

$$y(t) = x_K(t)$$

is matched to the MRI process $f(t)$ at each time. Here $\sigma$ is the sigmoidal
function that acts on each component of its argument, i.e., if we write

$$W(k)x_{k-1}(t) = z(t) = [z_1(t),\ldots,z_N(t)]^T,$$

then

$$\sigma(W(k)x_{k-1}(t)) = [\sigma(z_1(t)),\ldots,\sigma(z_N(t))]^T$$


The measurement at time $t$ is the true MRI process $f(t)$, which we model as the
neural network output $y(t)$ plus a noise/error term. To estimate the weights of
the neural network, we assume a weight dynamics

$$\mathrm{Vec}(W(k))(n+1) = \mathrm{Vec}(W(k))(n) + \epsilon_{W(k)}(n)$$

where $\epsilon_W$ is a noise/error process. We write the measurement model as

$$f(t) = y(t) + v(t) = h(W(t), \mathbf{s}(t)) + v(t)$$

where

$$W(t) = [\mathrm{Vec}(W(1))(t)^T, \ldots, \mathrm{Vec}(W(K))(t)^T]^T$$

is the vector of all the weights at time $t$. The EKF is driven by the output MRI
signal $f(t)$, and noting that the weight dynamics can be expressed in the form

$$W(t+1) = W(t) + \epsilon_W(t)$$

it follows that the EKF can be cast in the form

$$\hat{W}(t+1|t) = \hat{W}(t|t),$$

$$\hat{W}(t+1|t+1) = \hat{W}(t+1|t) + K(t+1)\big(f(t+1) - h(\hat{W}(t+1|t), \mathbf{s}(t+1))\big),$$

$$K(t+1) = P(t+1|t)H(t+1)^T\big(H(t+1)P(t+1|t)H(t+1)^T + P_v\big)^{-1},$$

$$H(t+1) = \frac{\partial h(\hat{W}(t+1|t), \mathbf{s}(t+1))}{\partial W},$$

$$P(t+1|t) = P(t|t) + P_\epsilon,$$

$$P(t+1|t+1) = (I - K(t+1)H(t+1))P(t+1|t)(I - K(t+1)H(t+1))^T + K(t+1)P_vK(t+1)^T$$

where $P_\epsilon$ and $P_v$ denote the covariances of $\epsilon_W$ and $v$ respectively.
Derive these equations from first principles and implement them in MATLAB.
For the MATLAB implementation, you can use a two-layer neural network. The
problem of computing the Jacobian $H$ of $h(W,\mathbf{s})$ involves back-propagation,
which is an elementary application of the chain rule of differential calculus, i.e.,
writing

$$x_k = \sigma(W(k)x_{k-1}), \quad k = 1,2,\ldots,K$$

we have

$$\frac{\partial h(W,\mathbf{s})}{\partial \mathrm{Vec}(W(k))} = \frac{\partial y}{\partial \mathrm{Vec}(W(k))} = \frac{\partial y}{\partial x_{K-1}}\,\frac{\partial x_{K-1}}{\partial x_{K-2}}\cdots\frac{\partial x_{k+1}}{\partial x_k}\,\frac{\partial x_k}{\partial \mathrm{Vec}(W(k))}$$

Write down all the terms involved in this expression explicitly for this model.
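As an illustration only, the EKF weight update for a two-layer network can be sketched in Python with NumPy (the text asks for MATLAB; the structure carries over directly). The function names, the row-major ordering used for Vec, and the noise covariances `P_eps`, `P_v` are assumptions made for this sketch, not part of the original problem statement.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(w, s, n):
    """Two-layer network y = sigma(W2 sigma(W1 s)); w stacks the rows of W1, then W2."""
    W1 = w[:n * n].reshape(n, n)
    W2 = w[n * n:].reshape(n, n)
    x1 = sigmoid(W1 @ s)
    y = sigmoid(W2 @ x1)
    return W2, x1, y

def jacobian(w, s, n):
    """H = dy/dw by the chain rule (back-propagation), shape (n, 2*n*n)."""
    W2, x1, y = forward(w, s, n)
    D2 = np.diag(y * (1.0 - y))      # derivative of sigma at the output layer
    D1 = np.diag(x1 * (1.0 - x1))    # derivative of sigma at the hidden layer
    H2 = np.kron(D2, x1[None, :])    # dy / dVec(W2)
    H1 = D2 @ W2 @ np.kron(D1, s[None, :])  # (dy/dx1)(dx1/dVec(W1))
    return np.hstack([H1, H2])

def ekf_step(w, P, s, f, P_eps, P_v, n):
    """One EKF iteration for the random-walk weight model W(t+1) = W(t) + eps(t)."""
    P_pred = P + P_eps                       # prediction: weights unchanged
    H = jacobian(w, s, n)
    y = forward(w, s, n)[2]
    S = H @ P_pred @ H.T + P_v               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    w_new = w + K @ (f - y)
    A = np.eye(w.size) - K @ H
    P_new = A @ P_pred @ A.T + K @ P_v @ K.T
    return w_new, P_new

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 3
    w_true = rng.standard_normal(2 * n * n)  # synthetic "true" network
    w, P = np.zeros(2 * n * n), np.eye(2 * n * n)
    for _ in range(200):
        s = rng.standard_normal(n)
        f = forward(w_true, s, n)[2]         # noiseless synthetic target
        w, P = ekf_step(w, P, s, f, 1e-6 * np.eye(w.size), 1e-3 * np.eye(n), n)
    print("trace of P after training:", np.trace(P))
```

Comparing `jacobian` against a finite-difference approximation is a useful check before running the filter on real data.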


Approximation of multivariate polynomials using a neural network: In order
to characterize the performance of a neural network in approximating a given
plant function, we need to calculate the mean square approximation error
involved in approximating polynomial functions using the network. The
sigmoidal functions used in the network must themselves be approximated by
polynomials, by truncating their Taylor series up to a given degree $N$, and the
minimum mean square estimation error is then evaluated based on such an
approximation. Consider first a two layer neural network with each layer having
two nodes. The output vector is then

$$y = \sigma(W_2\sigma(W_1x))$$

with $x = [x_1,x_2]^T$ as the input, where $W_k, k = 1,2$ are $2 \times 2$ matrices.
Therefore in component form,

$$y = [\sigma(W_2(11)z_1 + W_2(12)z_2),\ \sigma(W_2(21)z_1 + W_2(22)z_2)]^T,$$

where

$$z_1 = \sigma(W_1(11)x_1 + W_1(12)x_2), \quad z_2 = \sigma(W_1(21)x_1 + W_1(22)x_2)$$

The aim is to compute the minimum mean square approximation error

$$\min_{W_1,W_2}\int_D \big((p_1(x_1,x_2) - y_1)^2 + (p_2(x_1,x_2) - y_2)^2\big)\,dx_1\,dx_2$$

over a domain $D \subset \mathbb{R}^2$ for given bivariate polynomials $p_1, p_2$. For example,
we can take $D = [a,b] \times [c,d]$. This problem is the same as minimizing the
mean square error between the random variables $p(x)$ and $y = y(x)$ when $x$ is
a uniformly distributed random vector on $D$. More generally, we can talk about
minimizing this mean square error when $x$ has any given probability distribution
$F(x)$ in $\mathbb{R}^2$. Writing


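$$\sigma(z) = \sum_{k=0}^{N} c(k)z^k$$

we have that

$$\sigma(a_1u_1 + a_2u_2) = \sum_{k=0}^{N} c(k)(a_1u_1 + a_2u_2)^k = \sum_{k=0}^{N}\sum_{m=0}^{k} c(k)\binom{k}{m}a_1^m a_2^{k-m}u_1^m u_2^{k-m}$$

Thus,

$$z_1 = \sum_{k,m} c(k)\binom{k}{m}W_1(11)^m W_1(12)^{k-m}x_1^m x_2^{k-m}$$

$$z_2 = \sum_{k,m} c(k)\binom{k}{m}W_1(21)^m W_1(22)^{k-m}x_1^m x_2^{k-m}$$

$$y_1 = \sum_{k,m} c(k)\binom{k}{m}W_2(11)^m W_2(12)^{k-m}z_1^m z_2^{k-m}$$

$$= \sum_{k,m} c(k)\binom{k}{m}W_2(11)^m W_2(12)^{k-m}\Big(\sum_{k_1,m_1} c(k_1)\binom{k_1}{m_1}W_1(11)^{m_1}W_1(12)^{k_1-m_1}x_1^{m_1}x_2^{k_1-m_1}\Big)^{m}\Big(\sum_{k_2,m_2} c(k_2)\binom{k_2}{m_2}W_1(21)^{m_2}W_1(22)^{k_2-m_2}x_1^{m_2}x_2^{k_2-m_2}\Big)^{k-m}$$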


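For fixed $x_1, x_2$, this is a polynomial in the weights, of degree at most $N$ in
$(W_2(11), W_2(12))$ and at most $N^2$ in the entries of $W_1$. Likewise, the formula
for $y_2$ is given by

$$y_2 = \sum_{k,m} c(k)\binom{k}{m}W_2(21)^m W_2(22)^{k-m}z_1^m z_2^{k-m}$$

with $z_1, z_2$ expanded in the same way.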


We choose $D = [0,1] \times [0,1]$ without loss of generality (since in the case
when $D = [a,b] \times [a,b]$ we can replace the sigmoidal function $\sigma$ by a scaled
and translated version of it, thereby reducing the problem to $[0,1] \times [0,1]$). To
calculate the mean squared approximation error between $(y_1(x_1,x_2), y_2(x_1,x_2))$
and a bivariate polynomial pair $(p_1(x_1,x_2), p_2(x_1,x_2))$, we need to evaluate the
following integrals for nonnegative integers $r, s$:

$$\int_0^1\!\!\int_0^1 y_1^2\,dx_1\,dx_2, \qquad \int_0^1\!\!\int_0^1 y_2^2\,dx_1\,dx_2,$$

$$\int_0^1\!\!\int_0^1 y_1x_1^r x_2^s\,dx_1\,dx_2, \qquad \int_0^1\!\!\int_0^1 y_2x_1^r x_2^s\,dx_1\,dx_2$$
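The mean square approximation error in this problem can also be evaluated numerically, without expanding the sigmoid in a Taylor series. The following is a minimal sketch that uses Gauss-Legendre quadrature on $[0,1] \times [0,1]$; the function names and the choice of quadrature order are assumptions of this example, and the minimization over the weights is left to any standard optimizer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def net(W1, W2, x1, x2):
    """Two-layer, two-node network evaluated on arrays x1, x2 (no bias terms)."""
    z1 = sigmoid(W1[0, 0] * x1 + W1[0, 1] * x2)
    z2 = sigmoid(W1[1, 0] * x1 + W1[1, 1] * x2)
    y1 = sigmoid(W2[0, 0] * z1 + W2[0, 1] * z2)
    y2 = sigmoid(W2[1, 0] * z1 + W2[1, 1] * z2)
    return y1, y2

def mse(W1, W2, p1, p2, order=20):
    """Integral over [0,1]^2 of (p1 - y1)^2 + (p2 - y2)^2 by tensor Gauss-Legendre."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    u = 0.5 * (nodes + 1.0)          # map nodes from [-1, 1] to [0, 1]
    w = 0.5 * weights
    X1, X2 = np.meshgrid(u, u, indexing="ij")
    y1, y2 = net(W1, W2, X1, X2)
    err = (p1(X1, X2) - y1) ** 2 + (p2(X1, X2) - y2) ** 2
    return float(np.sum(np.outer(w, w) * err))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W1, W2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
    print(mse(W1, W2, lambda a, b: a * b, lambda a, b: a + b ** 2))
```

With all weights zero the network output is the constant $1/2$ in both components, which gives a simple sanity check of the quadrature.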
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Acknowledgement for problem [13]: I’ve borrowed this problem from my in- 
formal discussions with Prof. Vijayant Agarwal, Vijay Upreti and Gugloth Sagar. 


[14] This problem outlines the steps involved in deriving the EKF and UKF 
in discrete time. 

step 1: Let the state vector $x(t) \in \mathbb{R}^N$ satisfy the stochastic difference
equation

$$x(t+1) = f(t,x(t)) + g(t,x(t))w(t+1), \quad t = 0,1,2,\ldots \qquad (1)$$

where

$$f: \mathbb{Z}_+ \times \mathbb{R}^N \to \mathbb{R}^N, \quad g: \mathbb{Z}_+ \times \mathbb{R}^N \to \mathbb{R}^{N \times p}$$

and $w(t), t = 1,2,\ldots$ is white noise with zero mean, taking values in $\mathbb{R}^p$. Its
autocorrelation is given by

$$E(w(t)w(s)^T) = Q(t)\delta[t-s]$$

The measurement model is

$$z(t) = h(t,x(t)) + v(t), \quad t = 0,1,2,\ldots \qquad (2)$$

where $v(.)$ is also zero mean white and independent of $w(.)$:

$$E(v(t)v(s)^T) = R(t)\delta[t-s],$$

$$E(w(t)v(s)^T) = 0$$


(1) is called the state/process model and (2) the measurement model. w(.) 
is called process noise and v(.) called measurement noise. Note that when 
we say white noise, we mean that its samples are statistically independent, 
not just uncorrelated. If the noise is Gaussian, then uncorrelatedness implies 
independence but for non-Gaussian noises, independence is a stronger condition 
than uncorrelatedness. Let 


$$Z_t = \{z(s): 0 \le s \le t\}$$

be the measurement data collected up to time $t$. We shall be assuming that $x(0)$ is
independent of $\{w(t), v(t): t \ge 1\}$.

Remark: At time $t$, we have available with us $\hat{x}(t|t) = E(x(t)|Z_t)$ and
$P(t|t) = \mathrm{Cov}(e(t|t)|Z_t) = \mathrm{Cov}(x(t)|Z_t)$ where $e(t|t) = x(t) - \hat{x}(t|t)$. The
iteration process involves computing $\hat{x}(t+1|t+1)$ and $P(t+1|t+1)$. This computation
progresses in two stages: first compute $\hat{x}(t+1|t) = E(x(t+1)|Z_t)$,
$e(t+1|t) = x(t+1) - \hat{x}(t+1|t)$ and $P(t+1|t) = \mathrm{Cov}(e(t+1|t)|Z_t) = \mathrm{Cov}(x(t+1)|Z_t)$
based on only the state dynamics, and then update these to $\hat{x}(t+1|t+1)$ and
$P(t+1|t+1)$ based on the measurement $z(t+1)$.


step (2a): Computation of $\hat{x}(t+1|t)$.

From the state model, $x(t)$ is a function of $x(0), w(1),\ldots,w(t)$, and $z(t)$ is
hence a function of $x(0), w(1),\ldots,w(t), v(t)$. Thus, $Z_t$ is a function of
$x(0), w(1),\ldots,w(t), v(0),\ldots,v(t)$. It then follows from the assumptions made
that $w(t+1)$ is independent of $(Z_t, x(t))$. Hence, taking conditional expectations
on both sides of (1), and using the independence of $w(t+1)$ and $(Z_t, x(t))$, we get

$$\hat{x}(t+1|t) = E(x(t+1)|Z_t) = E(f(t,x(t))|Z_t) + E(g(t,x(t))w(t+1)|Z_t)$$

where

$$E[g(t,x(t))w(t+1)|Z_t] = E\big[E[g(t,x(t))w(t+1)|Z_t,x(t)]\,\big|\,Z_t\big] = E\big[g(t,x(t))E[w(t+1)|Z_t,x(t)]\,\big|\,Z_t\big] = 0$$

since

$$E[w(t+1)|Z_t,x(t)] = Ew(t+1) = 0$$


Remark: if $x, y, z$ are random vectors, then

$$E(E(x|y,z)|z) = E(x|z)$$

and if $x, y$ are independent, then

$$E(x|y) = E(x)$$

Prove these statements by assuming joint probability densities. Thus, we get

$$\hat{x}(t+1|t) = E(f(t,x(t))|Z_t)$$

Now if $f(t,x)$ is affine linear in $x$, i.e., of the form $u(t) + A(t)x$, then it is
immediate that

$$E(f(t,x(t))|Z_t) = f(t,E(x(t)|Z_t)) = f(t,\hat{x}(t|t))$$


In the general case, however, we cannot make this assumption. If $f(t,x)$ is
analytic in $x$, then we can Taylor expand it around $\hat{x}(t|t)$:

$$f(t,x) = u(t) + \sum_{n \ge 1} A_n(t)(x - \hat{x}(t|t))^{\otimes n}/n!$$

and then get

$$E[f(t,x(t))|Z_t] = u(t) + \sum_{n \ge 2} A_n(t)\mu_n(t|t)/n!$$

where

$$u(t) = f(t,\hat{x}(t|t))$$

and $\mu_n(t|t)$ is the conditional $n$th order estimation error moment of $x(t)$ given
$Z_t$, i.e.,

$$\mu_n(t|t) = E\big((x(t) - \hat{x}(t|t))^{\otimes n}\,\big|\,Z_t\big)$$

The $n = 1$ term drops out because $\mu_1(t|t) = 0$.


However, if we were to implement a filter based on such an approach, it would
become an infinite dimensional filter, i.e., at each time step, we would have to
update the conditional moments of all orders. The EKF is a finite dimensional
approximation to such an infinite dimensional filter in which we neglect the
conditional moments of all orders greater than two. Thus, we get an approximation

$$E[f(t,x(t))|Z_t] \approx f(t,\hat{x}(t|t)) + (1/2)f_{xx}(t,\hat{x}(t|t))\mathrm{Vec}(P(t|t))$$

We note that

$$\mu_2(t|t) = \mathrm{Vec}(P(t|t))$$

Most authors also neglect the second term here and simply make the
approximation

$$E[f(t,x(t))|Z_t] \approx f(t,\hat{x}(t|t))$$


The UKF, on the other hand, gives a better approximation even than that
obtained by truncating in the above way up to a given order of moments. It is
based on evaluating the conditional expectation $E(f(t,x(t))|Z_t)$ using the law
of large numbers. Specifically, writing

$$e(t|t) = x(t) - \hat{x}(t|t)$$

and defining

$$P(t|t) = \mathrm{Cov}(e(t|t)|Z_t) = \mathrm{Cov}(x(t)|Z_t)$$

we choose a sequence $\xi(m), m = 1,2,\ldots,K$ of iid $N(0,I_N)$ random vectors
independent of $Z_t$ and note that by the law of large numbers,

$$(1/K)\sum_{m=1}^{K} f\big(t,\hat{x}(t|t) + \sqrt{P(t|t)}\,\xi(m)\big)$$

conditioned on $Z_t$ converges to $E(f(t,x(t))|Z_t)$ as $K \to \infty$, provided that
conditioned on $Z_t$, $e(t|t)$ has a normal distribution. Thus, the result of the UKF
is

$$\hat{x}(t+1|t) = (1/K)\sum_{m=1}^{K} f\big(t,\hat{x}(t|t) + \sqrt{P(t|t)}\,\xi(m)\big)$$

Note that this is also an approximation. However, if we assume that $e(t|t)$
conditioned on $Z_t$ has a probability distribution $F_{e,t}(e)$ and we choose
$\xi(m), m = 1,2,\ldots,K$ conditioned on $Z_t$ to be iid with this distribution
$F_{e,t}(e)$, then by the law of large numbers,
$(1/K)\sum_{m=1}^{K} f(t,\hat{x}(t|t) + \xi(m))$ will converge as $K \to \infty$,
conditioned on $Z_t$, to $E(f(t,x(t))|Z_t)$. This defines the first stage of the EKF
and the UKF.

step (2b): We now have to compute

$$P(t+1|t) = \mathrm{Cov}(x(t+1)|Z_t) = \mathrm{Cov}(x(t+1) - \hat{x}(t+1|t)|Z_t) = \mathrm{Cov}(e(t+1|t)|Z_t)$$

The EKF computes this using the following:

$$e(t+1|t) = x(t+1) - f(t,\hat{x}(t|t)) = f(t,x(t)) + g(t,x(t))w(t+1) - f(t,\hat{x}(t|t))$$

$$\approx f_x(t,\hat{x}(t|t))e(t|t) + g(t,\hat{x}(t|t))w(t+1)$$

and hence

$$P(t+1|t) = f_x(t,\hat{x}(t|t))P(t|t)f_x(t,\hat{x}(t|t))^T + g(t,\hat{x}(t|t))Q(t+1)g(t,\hat{x}(t|t))^T$$

In the UKF, we are not allowed to make the approximation $f(t,\hat{x}(t|t))$ for
$E(f(t,x(t))|Z_t)$. Instead we must use the independent realizations
$f(t,\hat{x}(t|t) + \sqrt{P(t|t)}\,\xi(m)), m = 1,2,\ldots,K$ of
$f(t,x(t)) = f(t,\hat{x}(t|t) + e(t|t))$ conditioned on $Z_t$. Thus, in the UKF, based
on the law of large numbers, we compute

$$P(t+1|t) = (1/K)\sum_{m=1}^{K}\big(f(t,x_m) - \hat{x}(t+1|t)\big)\big(f(t,x_m) - \hat{x}(t+1|t)\big)^T + (1/K)\sum_{m=1}^{K} g(t,x_m)Q(t+1)g(t,x_m)^T$$

where $x_m = \hat{x}(t|t) + \sqrt{P(t|t)}\,\xi(m)$. The second sum is the contribution
of the process noise term $g(t,x(t))w(t+1)$, which is zero mean and independent of
$(Z_t, x(t))$ and hence conditionally uncorrelated with $f(t,x(t))$.
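The two prediction stages above can be compared numerically. The sketch below is only illustrative: the function names are invented for this example, the time argument $t$ is dropped, the square root of $P(t|t)$ is taken to be its Cholesky factor, and the sampling scheme is the Monte Carlo average described in the text (the unscented filter as usually implemented uses a small deterministic set of sigma points instead of random draws).

```python
import numpy as np

def ekf_predict(x_hat, P, f, f_jac, g, Q):
    """First-order (EKF) prediction of the mean and error covariance."""
    F = f_jac(x_hat)
    G = g(x_hat)
    return f(x_hat), F @ P @ F.T + G @ Q @ G.T

def sampled_predict(x_hat, P, f, g, Q, K, rng):
    """Sample-average prediction using K draws x_hat + sqrt(P) xi(m)."""
    root = np.linalg.cholesky(P)                 # one valid choice of sqrt(P)
    xi = rng.standard_normal((K, x_hat.size))    # iid N(0, I) vectors
    pts = x_hat + xi @ root.T
    fx = np.array([f(p) for p in pts])
    x_pred = fx.mean(axis=0)
    d = fx - x_pred
    P_pred = d.T @ d / K                         # spread of f over the samples
    # process-noise contribution, averaged over the same samples
    P_pred = P_pred + np.mean([g(p) @ Q @ g(p).T for p in pts], axis=0)
    return x_pred, P_pred

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = lambda x: np.array([x[0] + 0.1 * x[1], x[1] - 0.1 * np.sin(x[0])])
    f_jac = lambda x: np.array([[1.0, 0.1], [-0.1 * np.cos(x[0]), 1.0]])
    g = lambda x: np.array([[0.0], [1.0]])
    Q = np.array([[0.1]])
    x_hat, P = np.array([1.0, 2.0]), np.diag([0.5, 0.2])
    print(ekf_predict(x_hat, P, f, f_jac, g, Q))
    print(sampled_predict(x_hat, P, f, g, Q, 5000, rng))
```

For an affine $f$ and constant $g$ the two predictions agree up to sampling error, which is a convenient check; they differ when $f$ is strongly nonlinear over the spread of $P(t|t)$.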


step 3: The EKF computation of $\hat{x}(t+1|t+1)$ and $P(t+1|t+1)$.

$$\hat{x}(t+1|t+1) = E(x(t+1)|Z_{t+1}) = E(x(t+1)|Z_t, z(t+1))$$

In the EKF, we assume that the extra measurement modifies $\hat{x}(t+1|t)$ by an
additive term proportional to the output error at time $t+1$, i.e., the difference
between the true output measurement $z(t+1)$ and its estimate
$\hat{z}(t+1|t) = E(h(t+1,x(t+1))|Z_t)$ based on $Z_t$. Thus the EKF gives

$$\hat{x}(t+1|t+1) = \hat{x}(t+1|t) + K(t+1)\big(z(t+1) - h(t+1,\hat{x}(t+1|t))\big)$$

where $E(h(t+1,x(t+1))|Z_t)$ has been approximated by
$h(t+1,E(x(t+1)|Z_t)) = h(t+1,\hat{x}(t+1|t))$. Again the conditional expectation has
been pushed inside the nonlinearity. This algorithm for updating the conditional
expectation based on the newly arrived measurement rests on the idea that if
the output estimation error is "positive", then we increase the state estimate
proportionally, while if it is negative, we decrease the state estimate
proportionally. The "Kalman gain" $K(t+1)$ is computed so that

$$E\big(\|x(t+1) - \hat{x}(t+1|t+1)\|^2\,\big|\,Z_t\big) = E\big[\|e(t+1|t+1)\|^2\,\big|\,Z_t\big]$$


is a minimum. We note that

$$e(t+1|t+1) = x(t+1) - \hat{x}(t+1|t+1) = x(t+1) - \hat{x}(t+1|t) - K(t+1)\big(z(t+1) - h(t+1,\hat{x}(t+1|t))\big)$$

$$= e(t+1|t) - K(t+1)\big(h(t+1,x(t+1)) - h(t+1,\hat{x}(t+1|t)) + v(t+1)\big)$$

$$\approx (I - K(t+1)H(t+1))e(t+1|t) - K(t+1)v(t+1)$$

where

$$H(t+1) = \frac{\partial h(t+1,\hat{x}(t+1|t))}{\partial x}$$

and hence

$$E\big[\|e(t+1|t+1)\|^2\,\big|\,Z_t\big] = \mathrm{Tr}\big[(I - K(t+1)H(t+1))P(t+1|t)(I - K(t+1)H(t+1))^T + K(t+1)R(t+1)K(t+1)^T\big]$$


Minimizing this w.r.t. $K(t+1)$ using the variational calculus for functions of
matrices gives us the optimum Kalman gain as

$$K(t+1) = P(t+1|t)H(t+1)^T\big(H(t+1)P(t+1|t)H(t+1)^T + R(t+1)\big)^{-1}$$


The optimum value of $P(t+1|t+1)$ is then obtained by substituting this value
of the optimal Kalman gain into the expression

$$E\big[e(t+1|t+1)e(t+1|t+1)^T\,\big|\,Z_t\big] = (I - K(t+1)H(t+1))P(t+1|t)(I - K(t+1)H(t+1))^T + K(t+1)R(t+1)K(t+1)^T$$

and assuming that $P(t+1|t+1)$ is a function of only $Z_t$ and not of
$Z_{t+1} = (Z_t, z(t+1))$. The above expression then yields

$$P(t+1|t+1) = (I - K(t+1)H(t+1))P(t+1|t)(I - K(t+1)H(t+1))^T + K(t+1)R(t+1)K(t+1)^T$$

$$= (I - K(t+1)H(t+1))P(t+1|t)$$

$$= P(t+1|t) - P(t+1|t)H(t+1)^T\big(H(t+1)P(t+1|t)H(t+1)^T + R(t+1)\big)^{-1}H(t+1)P(t+1|t)$$
Exercise: Rewrite the expression for P(t+1|t+1) using the matrix inversion 
lemma. 
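The EKF measurement update of step 3 can be sketched as follows. This is an illustrative NumPy fragment, not a reference implementation: the function name and the example measurement function are assumptions, and the time argument of $h$ is dropped. It returns the covariance in the symmetric form, which with the optimal gain coincides with $(I - KH)P(t+1|t)$.

```python
import numpy as np

def ekf_update(x_pred, P_pred, z, h, h_jac, R):
    """EKF measurement update: returns the updated estimate, covariance and gain."""
    H = h_jac(x_pred)                         # Jacobian of h at the predicted state
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # optimal Kalman gain
    x_new = x_pred + K @ (z - h(x_pred))
    A = np.eye(x_pred.size) - K @ H
    P_new = A @ P_pred @ A.T + K @ R @ K.T    # symmetric form of the update
    return x_new, P_new, K

if __name__ == "__main__":
    h = lambda x: np.array([x[0] ** 2 + x[1]])          # hypothetical measurement map
    h_jac = lambda x: np.array([[2.0 * x[0], 1.0]])
    x_pred = np.array([1.0, 0.5])
    P_pred = np.array([[0.4, 0.1], [0.1, 0.3]])
    R = np.array([[0.05]])
    x_new, P_new, K = ekf_update(x_pred, P_pred, np.array([1.8]), h, h_jac, R)
    print(x_new, np.trace(P_pred), np.trace(P_new))
```

The symmetric form is often preferred in floating point because it keeps the covariance symmetric even when the gain is slightly perturbed.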
Remark: Note that the above expression for $P(t+1|t+1)$ shows that

$$P(t+1|t+1) \le P(t+1|t)$$

which in particular implies that

$$\mathrm{Tr}(P(t+1|t+1)) \le \mathrm{Tr}(P(t+1|t))$$

This means that if we base our state estimate on an extra data point, the
estimation error variance reduces. This is natural to expect.

step 4: The UKF calculation of $\hat{x}(t+1|t+1)$ and $P(t+1|t+1)$.
Here, we start with the expression

$$\hat{x}(t+1|t+1) = E[x(t+1)|Z_t, z(t+1)]$$


and assume that given $Z_t$, $(x(t+1), z(t+1))$ is jointly Gaussian. This is justified
since $x(t+1) = \hat{x}(t+1|t) + e(t+1|t)$ and $z(t+1) = \hat{z}(t+1|t) + e_z(t+1|t)$, which
means that the assumption amounts to saying that given $Z_t$,
$(e(t+1|t), e_z(t+1|t))$ are jointly Gaussian errors. Next, we use the fact that if
$X, Y$ are jointly Gaussian vectors, then

$$E(X|Y) = \mu_X + \Sigma_{XY}\Sigma_{YY}^{-1}(Y - \mu_Y),$$

$$\mathrm{Cov}(X|Y) = \Sigma_{XX} - \Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{YX}$$

where

$$\Sigma_{XX} = \mathrm{Cov}(X), \quad \Sigma_{YY} = \mathrm{Cov}(Y), \quad \Sigma_{XY} = \mathrm{Cov}(X,Y),$$

$$\mu_X = E(X), \quad \mu_Y = E(Y)$$

Based on these assumptions and formulae, we have the approximate formulae

$$\hat{x}(t+1|t+1) = E(x(t+1)|Z_t, z(t+1)) = \hat{x}(t+1|t) + \mathrm{Cov}(x(t+1), z(t+1)|Z_t)\,\mathrm{Cov}(z(t+1)|Z_t)^{-1}\big(z(t+1) - \hat{z}(t+1|t)\big)$$

where

$$\mathrm{Cov}(x(t+1), z(t+1)|Z_t) = \mathrm{Cov}(x(t+1), h(t+1,x(t+1))|Z_t)$$

$$= \mathrm{Cov}\big(e(t+1|t), h(t+1,\hat{x}(t+1|t) + e(t+1|t))\,\big|\,Z_t\big)$$

$$\approx (1/K)\sum_{m=1}^{K}\sqrt{P(t+1|t)}\,\eta(m)\big(h(t+1,\hat{x}(t+1|t) + \sqrt{P(t+1|t)}\,\eta(m)) - \hat{z}(t+1|t)\big)^T$$

where

$$\hat{z}(t+1|t) = (1/K)\sum_{m=1}^{K} h\big(t+1,\hat{x}(t+1|t) + \sqrt{P(t+1|t)}\,\eta(m)\big)$$

and $\eta(m), m = 1,2,\ldots,K$ are again iid standard normal random vectors.
Moreover,

$$\mathrm{Cov}(z(t+1)|Z_t) = \mathrm{Cov}(h(t+1,x(t+1))|Z_t) + R(t+1)$$

$$= \mathrm{Cov}\big(h(t+1,\hat{x}(t+1|t) + e(t+1|t))\,\big|\,Z_t\big) + R(t+1)$$

$$\approx (1/K)\sum_{m=1}^{K}\big(h(t+1,\hat{x}(t+1|t) + \sqrt{P(t+1|t)}\,\eta(m)) - \hat{z}(t+1|t)\big)\big(h(t+1,\hat{x}(t+1|t) + \sqrt{P(t+1|t)}\,\eta(m)) - \hat{z}(t+1|t)\big)^T + R(t+1)$$

and

$$P(t+1|t+1) = \mathrm{Cov}(x(t+1)|Z_t, z(t+1))$$

$$= \mathrm{Cov}(x(t+1)|Z_t) - \mathrm{Cov}(x(t+1), z(t+1)|Z_t)\,\mathrm{Cov}(z(t+1)|Z_t)^{-1}\,\mathrm{Cov}(x(t+1), z(t+1)|Z_t)^T$$

where all the terms except the first on the rhs have been computed above. The
first term is

$$\mathrm{Cov}(x(t+1)|Z_t) = P(t+1|t)$$

This completes the description of the EKF and the UKF. 
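The sample-based measurement update of step 4 can be sketched in the same style. As before, this is a hedged illustration: the function name, the use of the Cholesky factor for $\sqrt{P(t+1|t)}$, and the random-sampling scheme follow the description in the text rather than any particular library, and standard unscented filters replace the random draws by deterministic sigma points.

```python
import numpy as np

def sampled_update(x_pred, P_pred, z, h, R, K, rng):
    """Gaussian-conditioning update with sample estimates of Cov(x,z) and Cov(z)."""
    root = np.linalg.cholesky(P_pred)            # one valid choice of sqrt(P)
    eta = rng.standard_normal((K, x_pred.size))  # iid N(0, I) vectors
    dx = eta @ root.T                            # samples of e(t+1|t)
    hz = np.array([h(x_pred + d) for d in dx])   # h evaluated at sampled states
    z_pred = hz.mean(axis=0)                     # estimate of z_hat(t+1|t)
    dz = hz - z_pred
    C_xz = dx.T @ dz / K                         # Cov(x, z | Z_t)
    C_zz = dz.T @ dz / K + R                     # Cov(z | Z_t)
    gain = C_xz @ np.linalg.inv(C_zz)
    x_new = x_pred + gain @ (z - z_pred)
    P_new = P_pred - gain @ C_xz.T
    return x_new, P_new

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = lambda x: np.array([np.hypot(x[0], x[1])])   # hypothetical range measurement
    x_pred = np.array([1.0, -1.0])
    P_pred = np.array([[0.4, 0.1], [0.1, 0.3]])
    print(sampled_update(x_pred, P_pred, np.array([1.6]), h, np.array([[0.1]]), 5000, rng))
```

For a linear measurement function the result agrees with the ordinary Kalman update up to sampling error, which gives a direct consistency check.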


step 5: Performance analysis of the UKF based on the large deviation prin- 
ciple. 
[15] The Belavkin filter and how it improves upon the classical Kushner-Kallianpur
filter. The unitary evolution in system $\otimes$ bath space is given by the
Hudson-Parthasarathy noisy Schrödinger equation

$$dU(t) = [-(iH + P)dt - L\,dA + L^*dA^*]U(t), \quad P = LL^*/2$$

For any system space observable $X$, we define the system state at time $t$ to be

$$X(t) = j_t(X) = U(t)^*XU(t)$$

and by quantum Ito's formula, obtain

$$dj_t(X) = j_t(\theta_0(X))dt + j_t(\theta_1(X))dA(t) + j_t(\theta_2(X))dA(t)^*$$

where

$$\theta_0(X) = i[H,X] - XP - PX + LXL^* = i[H,X] - (1/2)(LL^*X + XLL^* - 2LXL^*)$$

$$\theta_1(X) = [L,X], \quad \theta_2(X) = [X,L^*] = \theta_1(X)^*$$

The measurement model is

$$Y(t) = U(t)^*Y_i(t)U(t), \quad Y_i(t) = \bar{c}A(t) + cA(t)^*, \quad c \in \mathbb{C}$$

Quantum Ito's formula is

$$dA\,dA^* = dt$$

We have by quantum Ito's formula,

$$dY(t) = dY_i(t) + j_t(cL + \bar{c}L^*)dt, \quad j_t(Z) = U(t)^*ZU(t)$$
for any $Z$ defined in $\mathfrak{h} \otimes \Gamma_s(L^2(\mathbb{R}_+))$, where $\mathfrak{h}$ is the system Hilbert space and
$\Gamma_s(L^2(\mathbb{R}_+))$ is the bath Boson Fock space. Note that $U(t)$ is a unitary operator
in $\mathfrak{h} \otimes \Gamma_s(L^2(\mathbb{R}_+))$. In classical filtering theory, the state $X(t)$ evolves
according to a classical sde:

$$dX(t) = f(X(t))dt + g(X(t))dB(t)$$

where $B(.)$ is classical Brownian motion and the measurement model is

$$dY(t) = h(X(t))dt + \sigma_V\,dV(t)$$

where $V(.)$ is a Brownian motion independent of $B(.)$. In classical probability
therefore, the homomorphism $j_t$ acts on the commutative algebra of real valued
functions $\phi$ on $\mathbb{R}^n$, where $X(t) \in \mathbb{R}^n$, and is defined by

$$j_t(\phi) = \phi(X(t))$$

and it satisfies the sde


$$dj_t(\phi) = \phi'(X(t))^TdX(t) + (1/2)\mathrm{Tr}\big(gg^T(X(t))\phi''(X(t))\big)dt$$

$$= j_t(K\phi)dt + \sum_{k,m} dB_m(t)\,j_t\big(g_{km}(X)D_k\phi\big)$$

where

$$K = f^TD + (1/2)\mathrm{Tr}(gg^TDD^T)$$

is the generator of the Markov process $X(t)$ and $D = \partial/\partial X$ is the gradient
operator. In the scalar process case, there is just one state variable and one
Brownian motion $B(.)$ and then the above simplifies to

$$dj_t(\phi) = j_t(K\phi)dt + j_t(gD\phi)dB(t)$$

where

$$K = fD + (1/2)g^2D^2, \quad D = d/dX$$

The measurement model in the classical case can be expressed as

$$dY(t) = j_t(h)dt + \sigma_V\,dV(t)$$

In the quantum case, the Hermitian operator $cL + \bar{c}L^*$ plays the role of $h$ and
$\bar{c}A(t) + cA(t)^*$ plays the role of the classical measurement noise $\sigma_VV(t)$. Note
that $\bar{c}A(t) + cA(t)^*$ by itself is a classical Brownian motion process in any state
of the system and bath with a variance of $|c|^2$ in place of $\sigma_V^2$. The quantum
analogue of the classical generator $K$ is the quantum Lindblad generator $\theta_0$.
Note that the classical observable $\phi$ is replaced by the quantum observable
$X$ and $\phi(X(t)) = j_t(\phi)$ by $j_t(X) = U(t)^*XU(t)$. $\theta_1(X), \theta_2(X) = \theta_1(X)^*$ are
reduced in the classical theory to $gD\phi$ if we take $B(t) = A(t) + A(t)^*$. Note that
in the quantum scenario, the process noise $A(t) + A(t)^*$ generally is correlated
with the measurement noise $\bar{c}A(t) + cA(t)^*$. It is uncorrelated only when we
choose $c$ to be purely imaginary.

Remark: In order to see the classical-quantum analogy better, we must relate
the classical theory to classical mechanics, Hamiltonians etc. Thus, we consider
the Langevin equation for a classical particle moving in a potential $U(q)$. The
stochastic differential equation of motion of such a particle is given by

$$dq(t) = p(t)dt, \quad dp(t) = -(\gamma p(t) + U'(q(t)))dt + g(q(t),p(t))dB(t)$$

The state of the system at time $t$ is now

$$X(t) = (q(t), p(t))$$

and the homomorphism $j_t$ is

$$j_t(\phi) = \phi(q(t), p(t))$$

Then, by Ito's formula,

$$dj_t(\phi) = j_t(K\phi)dt + j_t(g\,\partial\phi/\partial p)dB(t)$$

where

$$K\phi = p\,\partial\phi/\partial q - (\gamma p + U'(q))\,\partial\phi/\partial p + (1/2)g^2\,\partial^2\phi/\partial p^2$$

Noting that the Hamiltonian of the particle is

$$H = p^2/2 + U(q)$$

we can write

$$K\phi = \{\phi, H\}_P - \gamma p\,\partial\phi/\partial p + (1/2)g^2\,\partial^2\phi/\partial p^2$$


where $\{.,.\}_P$ denotes the Poisson bracket. To see what the quantum
generalization of this is, we choose $L = \psi(q,p)$ where now $q, p$ are Hermitian
operators satisfying the commutation relations

$$[q,p] = ih/2\pi$$

Then,

$$\theta_0(X) = i[H,X] - (1/2)(LL^*X + XLL^* - 2LXL^*) = i[H,X] - (1/2)(L[L^*,X] + [X,L]L^*)$$

Taking

$$X = \phi(q,p)$$

we have (in units with $h/2\pi = 1$)

$$i[H,X] = i[p^2/2 + U(q), \phi(q,p)] = (i/2)([p,\phi]p + p[p,\phi]) + i[U(q),\phi]$$

$$= (1/2)\{\partial_q\phi, p\} + i[U,\phi]$$


where $\{.,.\}$ denotes the anticommutator. In the classical case, where Lie
brackets are replaced by Poisson brackets, this expression becomes

$$\{\phi, H\}_P = p\,\partial_q\phi - (\partial_qU)(\partial_p\phi)$$

but we see that in the quantum case, there are additional factors in view of the
non-commutativity of $q$ and $p$ that are expressible as a power series in Planck's
constant. For example, suppose

$$\phi(q,p) = \phi_0(q) + \{\phi_1(q), p\}$$

then we get for the kinetic part

$$i[p^2/2, X] = (1/2)\{\partial_q\phi_0(q), p\} + (1/2)\{\{\partial_q\phi_1(q), p\}, p\}$$

In the classical case, these equations simplify as

$$\phi(q,p) = \phi_0(q) + 2p\phi_1(q),$$

and $i[p^2/2, X]$ gets replaced by

$$p\,\partial_q\phi_0(q) + 2p^2\,\partial_q\phi_1(q)$$

A further special case of this is when

$$X = \phi(q,p) = \phi_0(q)$$

Then,

$$i[H,X] = (1/2)\{\partial_q\phi_0(q), p\}$$

while in the classical case, this reduces to

$$p\,\partial_q\phi_0(q)$$

As for the Lindblad terms, in the quantum case with $X = \phi(q,p)$ and
$L = \psi(q,p)$, we find that the quantum version of

$$-\gamma p\,\partial\phi/\partial p + (1/2)g^2\,\partial^2\phi/\partial p^2$$

should be simply

$$-(1/2)(L[L^*,X] + [X,L]L^*)$$

where

$$X = \phi(q,p)$$

and $L = \psi(q,p)$ is chosen appropriately in terms of the classical function $g(q,p)$.
For example, choosing

$$L = aq + bp, \quad a, b \in \mathbb{C},$$

we get

$$[L^*, X] = [\bar{a}q + \bar{b}p, \phi(q,p)] = i\bar{a}\,\partial_p\phi - i\bar{b}\,\partial_q\phi$$

$$[X, L] = -ia\,\partial_p\phi + ib\,\partial_q\phi$$

$$L[L^*,X] + [X,L]L^* = (aq + bp)(i\bar{a}\,\partial_p\phi - i\bar{b}\,\partial_q\phi) + (-ia\,\partial_p\phi + ib\,\partial_q\phi)(\bar{a}q + \bar{b}p)$$

$$= i|a|^2[q, \partial_p\phi] + ib\bar{a}(p\,\partial_p\phi + \partial_q\phi\,q) - ia\bar{b}(q\,\partial_q\phi + \partial_p\phi\,p) - i|b|^2[p, \partial_q\phi]$$

$$= -|a|^2\partial_p^2\phi - |b|^2\partial_q^2\phi + ib\bar{a}(p\,\partial_p\phi + \partial_q\phi\,q) - ia\bar{b}(q\,\partial_q\phi + \partial_p\phi\,p)$$

(Remark: To get some sort of agreement with the classical case, the term
involving $\partial_q^2\phi$ should not appear. Thus, we set $b = 0$, i.e.,

$$L = aq$$

in which case,

$$L[L^*,X] + [X,L]L^* = -|a|^2\partial_p^2\phi$$

but then we are not able to get the damping term $\gamma p\,\partial_p\phi$.) So we see that if
we constrain the Lindblad operator $L$ to be linear in $q$ and $p$, we are able to
obtain some sort of a quantum analogue of the classical Langevin equation but
with some additional terms. Moreover, by restricting $L$ to be linear in $q, p$, we
cannot in the quantum case account for a general diffusion coefficient $g^2(q,p)$
dependent on $q, p$ present in the classical case.


Up to this point, we have drawn analogies between the classical Fokker-Planck
equation for stochastic differential equations in classical mechanics and
probability on the one hand and quantum stochastic differential equations in
quantum mechanics and quantum probability on the other. Now, we try to
draw analogies between the classical filtering and quantum filtering equations.

First we note that in the quantum case, the measurement noise process
$Y_i(t) = \bar{c}A(t) + cA(t)^*$ is also a Brownian motion with a variance parameter $|c|^2$
and the process noise in $dj_t(X)$ appears as $j_t([L,X])dA(t) + j_t([X,L^*])dA(t)^*$.
This measurement noise is generally correlated with the process noise unless
it happens that $L^* = L$ and $c$ is real, in which case we get that the process
noise appears as $j_t([L,X])(dA(t) - dA(t)^*)$ with $[L,X]^* = [X,L] = -[L,X]$
while the measurement noise differential is $c(dA(t) + dA(t)^*)$. Another case in
which this happens is when $c$ is pure imaginary and $L^* = -L$, in which case
the process noise appears as $j_t([L,X])(dA(t) + dA(t)^*)$ (since
$[L,X]^* = [X,L^*] = -[X,L] = [L,X]$) while the measurement noise differential
is $\bar{c}(dA(t) - dA(t)^*)$. These two cases are the only cases in which the process noise
and measurement noise are independent Brownian motions as in the classical
model for nonlinear filtering.

The Belavkin filter is obtained by denoting the non-demolition measurements
up to time $t$ by

$$Z_t = \sigma(Y(s): s \le t)$$

and writing

$$E(j_t(X)|Z_t) = \pi_t(X)$$

and noting that $\pi_t(X), t \ge 0$ form an Abelian family of operators along with $Z_t$,
and hence we can assume that they satisfy an equation

$$d\pi_t(X) = F_t(X)dt + G_t(X)dY(t)$$

where $F_t(X), G_t(X)$ are $Z_t$-measurable. Then assuming that

$$dC(t) = f(t)C(t)dY(t), \quad C(0) = 1$$

we get that $C(t)$ is $Z_t$-measurable and therefore by the basic orthogonality
principle in signal estimation theory,

$$E[(j_t(X) - \pi_t(X))C(t)] = 0$$


[16] Simultaneous application of the representation theory of the permuta- 
tion groups and the Euclidean motion group in three dimensions to 3-D image 
processing problems. 

Assume that we are given $n$ 3-D objects whose centres are located at the
positions $r_j, j = 1,2,\ldots,n$. The $k$th object, whose centre is located at $r_k$, will
emit a signal $f_k(r - r_k)$. Thus, the total signal emitted by all the $n$ objects is
given by

$$X(r|r_1,\ldots,r_n) = \sum_{k=1}^{n} f_k(r - r_k)$$

Now suppose we permute the objects by applying a permutation $\sigma \in S_n$ and
also rotate the entire array of objects by the rotation $R \in SO(3)$. Then the
resulting signal field becomes

$$Y(r|r_1,\ldots,r_n) = \sum_{k=1}^{n} f_k(R^{-1}r - r_{\sigma^{-1}k}) = X(R^{-1}r|r_{\sigma^{-1}1},\ldots,r_{\sigma^{-1}n})$$


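From measurements of the signal fields $X$ and $Y$, I wish to estimate the rotation
$R$ and the permutation $\sigma$. Let $\pi$ be a unitary representation of $SO(3)$ and $\eta$ a
unitary representation of $S_n$. We compute

$$\sum_{\rho \in S_n} Y(r|r_{\rho 1},\ldots,r_{\rho n})\eta(\rho)^* = \sum_{\rho \in S_n} X(R^{-1}r|r_{\sigma^{-1}\rho 1},\ldots,r_{\sigma^{-1}\rho n})\eta(\rho)^*$$

$$= \Big[\sum_{\rho \in S_n} X(R^{-1}r|r_{\rho 1},\ldots,r_{\rho n})\eta(\rho)^*\Big]\eta(\sigma)^*$$

Also,

$$\int_{SO(3)} Y(Sr|r_1,\ldots,r_n)\pi(S)^*dS = \int X(R^{-1}Sr|r_{\sigma^{-1}1},\ldots,r_{\sigma^{-1}n})\pi(S)^*dS$$

$$= \int X(Sr|r_{\sigma^{-1}1},\ldots,r_{\sigma^{-1}n})\pi(RS)^*dS = \Big[\int X(Sr|r_{\sigma^{-1}1},\ldots,r_{\sigma^{-1}n})\pi(S)^*dS\Big]\pi(R)^*$$

More generally, we then have

$$\sum_{\rho \in S_n}\int Y(Sr|r_{\rho 1},\ldots,r_{\rho n})\,\pi(S)^* \otimes \eta(\rho)^*\,dS = \Big[\sum_{\rho \in S_n}\int X(Sr|r_{\rho 1},\ldots,r_{\rho n})\,\pi(S)^* \otimes \eta(\rho)^*\,dS\Big](\pi(R)^* \otimes \eta(\sigma)^*)$$

This equation gives us a clue to how linear estimation theory can be applied to
estimate both $R \in SO(3)$ and $\sigma \in S_n$.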


[17] (Part of a B.E. project) In quantum mechanics, probabilities of events
are computed w.r.t. a given state of the system. If the system is in a pure state
$|\psi\rangle$, then the probability of an event described by the projection $P$ is given
by $\langle\psi|P|\psi\rangle$. Many times, calculating the pure state wave function $|\psi\rangle$
becomes very complicated because we have to solve a Schrödinger equation for
that. A typical example is that of a two electron atom like Helium, where the
wave function is a function of two position variables, i.e., a function of six real
variables. However, there do exist approximate ways of calculating the required
wave function. One such is the Hartree-Fock method. For example, in
a two electron atom, we know that the wave function must be antisymmetric
w.r.t. interchange of the two position and spin variables, owing to the Pauli
exclusion principle, which states that two electrons cannot occupy the same
state, i.e., have the same positions and spins. The Hartree-Fock approximation
involves assuming that the wave function is a product of position wave functions
and spin wave functions, with one of them being symmetric and the other one
antisymmetric w.r.t. interchange of the two electrons. Further, we assume that
the position part of the wave function, if antisymmetric, can be represented as the
antisymmetrizer of the product of single particle position wave functions and,
if symmetric, can be represented as the symmetrizer of the product of single
particle position wave functions, with the same argument being valid for the
spin wave functions.


I outline here most of the steps involved in doing the Hartree-Fock simulation
for a two electron atom with interacting spins and angular momenta. The
Hamiltonian without taking spin or orbital momentum interactions into account
is given by

$$H_{01} + H_{02} + V_{12}$$

where

$$H_{01} = p_1^2/2m - 2e^2/r_1, \quad H_{02} = p_2^2/2m - 2e^2/r_2, \quad V_{12} = e^2/|r_2 - r_1|$$

where

$$p_1 = -i\nabla_1, \quad p_2 = -i\nabla_2$$

The magnetic field produced by the two nuclear protons at the site of the first
electron in view of their relative motions is given by

$$B_{p1}(r_1) = ev_1 \times r_1/r_1^3 = e_1L_1/r_1^3, \quad L_1 = r_1 \times p_1$$

The interaction energy between this magnetic field and the total spin orbital
magnetic moment

$$M_1 = (-e/2m)(L_1 + g\sigma_1)$$

of the first electron is given by

$$V_{p1} = -(M_1, B_{p1}(r_1)) = c_1(L_1 + g\sigma_1, L_1)/r_1^3$$

Likewise the magnetic interaction between the protons and the total magnetic
moment of the second electron is given by

$$V_{p2} = c_2(L_2 + g\sigma_2, L_2)/r_2^3$$

Now the first electron, moving with a velocity $v_1$, produces at the site of the
second electron, which moves with a relative velocity $v_2 - v_1$ w.r.t. the first
electron, a magnetic field given by

$$B_{m21} = -e(v_1 - v_2) \times (r_2 - r_1)/|r_2 - r_1|^3 = (e/m)(-L_1 - L_2 - p_1 \times r_2 - p_2 \times r_1)/|r_2 - r_1|^3$$

since

$$v_1 = p_1/m, \quad v_2 = p_2/m, \quad L_1 = r_1 \times p_1, \quad L_2 = r_2 \times p_2$$

and further the spin magnetic moment of the first electron produces a magnetic
field

$$B_{s21} = \mathrm{curl}_2\big((-ge\sigma_1/2m) \times (r_2 - r_1)/|r_2 - r_1|^3\big)$$

Remark: A magnetic moment $m$ at the origin produces a magnetic vector
potential

$$A = m \times r/r^3$$

and a magnetic field

$$B = \mathrm{curl}A = \mathrm{curl}(m \times r/r^3) = -\mathrm{curl}(m \times \nabla(1/r))$$

$$= -m\nabla^2(1/r) + (m,\nabla)\nabla(1/r)$$

$$= (m,\nabla)(-r/r^3) = -m_x\partial_x(r/r^3) - m_y\partial_y(r/r^3) - m_z\partial_z(r/r^3)$$

$$= -m_x(\hat{x}/r^3 - 3rx/r^5) - m_y(\hat{y}/r^3 - 3ry/r^5) - m_z(\hat{z}/r^3 - 3rz/r^5)$$

$$= -m/r^3 + 3(m,r)r/r^5$$

(here $\nabla^2(1/r) = 0$ away from the origin).
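The dipole field formula in the remark above is easy to check numerically. The sketch below, with function names chosen for this example, takes the curl of $A = m \times r/r^3$ by central finite differences at a point away from the origin and compares it with $-m/r^3 + 3(m,r)r/r^5$.

```python
import numpy as np

def vector_potential(m, r):
    """A = m x r / |r|^3 for a point dipole m at the origin."""
    return np.cross(m, r) / np.linalg.norm(r) ** 3

def curl_fd(F, r, h=1e-5):
    """Curl of the vector field F at the point r by central finite differences."""
    J = np.zeros((3, 3))
    for j in range(3):
        e = np.zeros(3)
        e[j] = h
        J[:, j] = (F(r + e) - F(r - e)) / (2.0 * h)   # J[i, j] = dF_i / dx_j
    return np.array([J[2, 1] - J[1, 2], J[0, 2] - J[2, 0], J[1, 0] - J[0, 1]])

def dipole_field(m, r):
    """Closed form B = -m/r^3 + 3 (m.r) r / r^5."""
    rn = np.linalg.norm(r)
    return -m / rn ** 3 + 3.0 * np.dot(m, r) * r / rn ** 5

if __name__ == "__main__":
    m = np.array([0.3, -1.2, 0.7])
    r = np.array([1.0, 0.5, -0.8])
    print(curl_fd(lambda p: vector_potential(m, p), r))
    print(dipole_field(m, r))
```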


Thus the interaction energy between the total magnetic moment of the second
electron and the magnetic field generated by the first electron due to its relative
motion and spin is given by

$$(B_{m21} + B_{s21}, (e/2m)(L_2 + g\sigma_2))$$

$$= (e/2m)\big(L_2 + g\sigma_2, (-L_1 - L_2 - p_1 \times r_2 - p_2 \times r_1)\big)/|r_2 - r_1|^3$$

$$+ (ge/2m)\big(L_2 + g\sigma_2,\ \sigma_1/|r_2 - r_1|^3 - 3(\sigma_1, r_2 - r_1)(r_2 - r_1)/|r_2 - r_1|^5\big)$$

Likewise, the interaction energy between the total magnetic moment of the first
electron and the magnetic field generated by the second electron due to its
relative motion and spin is given by

$$(B_{m12} + B_{s12}, (e/2m)(L_1 + g\sigma_1))$$

$$= (e/2m)\big(L_1 + g\sigma_1, (-L_1 - L_2 - p_1 \times r_2 - p_2 \times r_1)\big)/|r_2 - r_1|^3$$

$$+ (ge/2m)\big(L_1 + g\sigma_1,\ \sigma_2/|r_2 - r_1|^3 - 3(\sigma_2, r_2 - r_1)(r_2 - r_1)/|r_2 - r_1|^5\big)$$


Remark: There are other ways to calculate the interaction energy between
the two electrons involving magnetic fields produced by them and their magnetic
moments. They give different results. For example, the magnetic field produced
by the first electron based on its motion and spin magnetic moment is given by

$$B_1(r) = -ev_1 \times (r - r_1)/|r - r_1|^3 + \mathrm{curl}\big(m_1 \times (r - r_1)/|r - r_1|^3\big)$$

$$= -ev_1 \times (r - r_1)/|r - r_1|^3 - m_1/|r - r_1|^3 + 3(m_1, r - r_1)(r - r_1)/|r - r_1|^5$$

where

$$m_1 = -ge\sigma_1/2m$$

is the spin magnetic moment of the first electron. Likewise, the magnetic field
produced by the second electron is

$$B_2(r) = -ev_2 \times (r - r_2)/|r - r_2|^3 + \mathrm{curl}\big(m_2 \times (r - r_2)/|r - r_2|^3\big)$$

$$= -ev_2 \times (r - r_2)/|r - r_2|^3 - m_2/|r - r_2|^3 + 3(m_2, r - r_2)(r - r_2)/|r - r_2|^5$$

The total energy in the magnetic field produced by the two electrons is

$$E_B = (\mu/2)\int |B_1(r) + B_2(r)|^2\,d^3r$$

and the interaction part of this energy is clearly the cross term

$$\mu\int (B_1(r), B_2(r))\,d^3r$$

From the above consideration, it is clear that the magnetic interaction energy
must be a scalar operator built out of the vector operators
$p_1, p_2, r_2 - r_1, \sigma_1, \sigma_2$. It is then easy to see that this interaction energy
must have the form

$$f_1(|r_2 - r_1|)(p_1, p_2) + f_2(|r_2 - r_1|)(p_1, \sigma_2) + f_2(|r_2 - r_1|)(p_2, \sigma_1) + f_3(|r_2 - r_1|)((r_2 - r_1) \times p_1, \sigma_2)$$

$$+ f_3(|r_2 - r_1|)((r_1 - r_2) \times p_2, \sigma_1) + f_4(|r_2 - r_1|)(\sigma_1, \sigma_2) + f_5(|r_2 - r_1|)(\sigma_1 \times \sigma_2, r_2 - r_1)$$


In order to formulate the Hartree-Fock equations taking spin into account, we
must therefore first discretize the spatial region into $N^3$ pixels, $N$ being the
number of pixels along each of the $xyz$ coordinate axes. Then
$p_{1x}, p_{1y}, p_{1z}, x_1, y_1, z_1$ are each represented by $N^3 \times N^3$ matrices, i.e.,
these act in the Hilbert space $\mathbb{C}^{N^3}$. Likewise for
$p_{2x}, p_{2y}, p_{2z}, x_2, y_2, z_2$. These act in another independent Hilbert space
$\mathbb{C}^{N^3}$. Each of the spin matrices $\sigma_{1x}, \sigma_{1y}, \sigma_{1z}$ is represented by a $2 \times 2$
Hermitian matrix, i.e., these act in the Hilbert space $\mathbb{C}^2$. Likewise,
$\sigma_{2x}, \sigma_{2y}, \sigma_{2z}$ act in another independent Hilbert space $\mathbb{C}^2$. The total of all
the above operators therefore acts in the tensor product Hilbert space

$$\mathcal{H} = \mathbb{C}^{N^3} \otimes \mathbb{C}^{N^3} \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 = \mathbb{C}^{4N^6}$$
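In a numerical implementation, an operator acting on one tensor factor is lifted to the full space by Kronecker products with identities on the other factors. The following is a minimal sketch with a tiny grid ($N = 2$); the helper name `embed` and the random Hermitian matrix standing in for a single-electron operator are assumptions of this example.

```python
import numpy as np

def embed(op, slot, dims):
    """Return I x ... x op x ... x I, with op placed in tensor factor number `slot`."""
    out = np.eye(1)
    for i, d in enumerate(dims):
        out = np.kron(out, op if i == slot else np.eye(d))
    return out

N = 2
dims = [N ** 3, N ** 3, 2, 2]            # two position spaces, two spin spaces

sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])

rng = np.random.default_rng(0)
A = rng.standard_normal((N ** 3, N ** 3))
H1 = A + A.T                             # placeholder Hermitian one-electron operator

H1_full = embed(H1, 0, dims)             # acts on electron 1 positions only
H2_full = embed(H1, 1, dims)             # same matrix acting on electron 2
s1z_full = embed(sigma_z, 2, dims)       # spin of electron 1
s2x_full = embed(sigma_x, 3, dims)       # spin of electron 2

if __name__ == "__main__":
    print(H1_full.shape)                 # total dimension is 4 N^6
```

Operators lifted from different factors commute, which is one quick way to confirm that the ordering of the factors has been handled consistently. For realistic $N$ the full matrices are far too large to store densely, so sparse or matrix-free representations would be needed.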
The Hamiltonian of the system thus has the following decomposition: 
A = Hy, @ I3n3 @ Ig ®@ Ig + I3n3 @ He ® In ® Ig 
+V 


Here $V$ acts in the joint tensor product space and can be decomposed as

$$V = \sum_{k=1}^{p} V_{1k}\otimes V_{2k}\otimes V_{3k}\otimes V_{4k}$$

where $V_{1k}$ is a function of only the position and momentum variables of the first electron, and thus acts in $\mathbb{C}^{N^3}$, namely the first tensor product component in $\mathcal{H}$. $V_{2k}$ is a function of only the position and momentum variables of the second electron and thus acts in the second tensor product component $\mathbb{C}^{N^3}$ of $\mathcal{H}$. $V_{3k}$ is a function of only the spin matrices of the first electron and therefore acts in the third tensor product component $\mathbb{C}^2$, and finally, $V_{4k}$ is a function of only the spin matrices of the second electron and therefore acts in the fourth tensor product component $\mathbb{C}^2$. Accordingly, in the Hartree-Fock approximation, we can assume that the state $\psi$ of the two electrons has one of the following forms:


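[a]
$$\psi = C(\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1)\,|{+}{+}\rangle$$
[b]
$$\psi = C(\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1)\,|{-}{-}\rangle$$
[c]
$$\psi = C(\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1)\,(|{+}{-}\rangle + |{-}{+}\rangle)/\sqrt{2}$$
[d]
$$\psi = C(\psi_1\otimes\psi_2 + \psi_2\otimes\psi_1)\,(|{+}{-}\rangle - |{-}{+}\rangle)/\sqrt{2}$$

where $\psi_1, \psi_2\in\mathbb{C}^{N^3}$, the first factor of each product $\psi_i\otimes\psi_j$ lies in the first tensor product component and the second factor in the second, and objects such as $|{+}{+}\rangle, |{-}{-}\rangle, |{+}{-}\rangle, |{-}{+}\rangle$ have their first component in the third tensor product component and their second component in the fourth tensor product component. Substituting, for example, the first expression [a] for $\psi$ gives us, on noting that $H_1$ and $H_2$ are identical matrices, as are $V_{1k}$ and $V_{2k}$, and also $V_{3k}$ and $V_{4k}$ for each $k$, the following expression for the average of $H$ in the state $|\psi\rangle$ of case [a]:

$$\begin{aligned}
\langle\psi|H|\psi\rangle ={}& 2C^2[\langle\psi_1|H_1|\psi_1\rangle + \langle\psi_2|H_1|\psi_2\rangle - \langle\psi_1|H_1|\psi_2\rangle\langle\psi_2|\psi_1\rangle - \langle\psi_2|H_1|\psi_1\rangle\langle\psi_1|\psi_2\rangle] \\
&+ C^2\sum_{k=1}^{p}\langle\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1|V_{1k}\otimes V_{2k}|\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1\rangle\,\langle{+}{+}|V_{3k}\otimes V_{4k}|{+}{+}\rangle \\
={}& 2C^2[\langle\psi_1|H_1|\psi_1\rangle + \langle\psi_2|H_1|\psi_2\rangle - \langle\psi_1|H_1|\psi_2\rangle\langle\psi_2|\psi_1\rangle - \langle\psi_2|H_1|\psi_1\rangle\langle\psi_1|\psi_2\rangle] \\
&+ 2C^2\sum_{k=1}^{p}\big(\langle\psi_1|V_{1k}|\psi_1\rangle\langle\psi_2|V_{1k}|\psi_2\rangle - \langle\psi_1|V_{1k}|\psi_2\rangle\langle\psi_2|V_{1k}|\psi_1\rangle\big)\,|\langle +|V_{3k}|+\rangle|^2
\end{aligned}$$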


We impose the normalization constraints

$$\langle\psi_1|\psi_1\rangle = 1 = \langle\psi_2|\psi_2\rangle,\qquad 2C^2(1 - |\langle\psi_1|\psi_2\rangle|^2) = 1$$

We must now apply the variational principle to extremize $\langle\psi|H|\psi\rangle$ w.r.t. $|\psi_1\rangle$ and $|\psi_2\rangle$ subject to the above constraints.
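The constraint tying $C$ to the overlap follows from $\|\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1\|^2 = 2(1 - |\langle\psi_1|\psi_2\rangle|^2)$ for normalized $\psi_1, \psi_2$. A small numerical check of this identity is sketched below; the two sample vectors are arbitrary illustrative choices and the spin factor, which has unit norm, is left out.

```python
def inner(u, v):
    """Hermitian inner product <u|v>, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def tensor(u, v):
    """Tensor product of two vectors, flattened in row-major order."""
    return [a * b for a in u for b in v]

# Two normalized but non-orthogonal orbitals (arbitrary sample values).
psi1 = [0.6, 0.8j, 0.0]
psi2 = [1 / 3, 2 / 3, 2j / 3]

overlap = inner(psi1, psi2)
slater = [a - b for a, b in zip(tensor(psi1, psi2), tensor(psi2, psi1))]

norm_sq = inner(slater, slater).real        # squared norm of the antisymmetrized product
predicted = 2 * (1 - abs(overlap) ** 2)     # closed form used in the constraint
C_sq = 1 / predicted                        # value of C^2 that normalizes the state
```

For orthogonal orbitals the overlap vanishes and the same formula gives $C^2 = 1/2$.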


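Likewise, for the second case [b],

$$\begin{aligned}
\langle\psi|H|\psi\rangle ={}& 2C^2[\langle\psi_1|H_1|\psi_1\rangle + \langle\psi_2|H_1|\psi_2\rangle - \langle\psi_1|H_1|\psi_2\rangle\langle\psi_2|\psi_1\rangle - \langle\psi_2|H_1|\psi_1\rangle\langle\psi_1|\psi_2\rangle] \\
&+ 2C^2\sum_{k=1}^{p}\big(\langle\psi_1|V_{1k}|\psi_1\rangle\langle\psi_2|V_{1k}|\psi_2\rangle - \langle\psi_1|V_{1k}|\psi_2\rangle\langle\psi_2|V_{1k}|\psi_1\rangle\big)\,|\langle -|V_{3k}|-\rangle|^2
\end{aligned}$$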


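For the third case [c],

$$\begin{aligned}
\langle\psi|H|\psi\rangle ={}& 2C^2[\langle\psi_1|H_1|\psi_1\rangle + \langle\psi_2|H_1|\psi_2\rangle - \langle\psi_1|H_1|\psi_2\rangle\langle\psi_2|\psi_1\rangle - \langle\psi_2|H_1|\psi_1\rangle\langle\psi_1|\psi_2\rangle] \\
&+ 2C^2\sum_{k=1}^{p}\big(\langle\psi_1|V_{1k}|\psi_1\rangle\langle\psi_2|V_{1k}|\psi_2\rangle - \langle\psi_1|V_{1k}|\psi_2\rangle\langle\psi_2|V_{1k}|\psi_1\rangle\big) \\
&\qquad\times(1/2)(\langle{+}{-}| + \langle{-}{+}|)(V_{3k}\otimes V_{4k})(|{+}{-}\rangle + |{-}{+}\rangle)
\end{aligned}$$

Now observe that

$$\begin{aligned}
(\langle{+}{-}| + \langle{-}{+}|)(V_{3k}\otimes V_{4k})(|{+}{-}\rangle + |{-}{+}\rangle) ={}& \langle{+}{-}|V_{3k}\otimes V_{4k}|{+}{-}\rangle + \langle{-}{+}|V_{3k}\otimes V_{4k}|{-}{+}\rangle \\
&+ \langle{+}{-}|V_{3k}\otimes V_{4k}|{-}{+}\rangle + \langle{-}{+}|V_{3k}\otimes V_{4k}|{+}{-}\rangle
\end{aligned}$$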
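and finally for the fourth case [d],

$$\begin{aligned}
\langle\psi|H|\psi\rangle ={}& 2C^2[\langle\psi_1|H_1|\psi_1\rangle + \langle\psi_2|H_1|\psi_2\rangle + \langle\psi_1|H_1|\psi_2\rangle\langle\psi_2|\psi_1\rangle + \langle\psi_2|H_1|\psi_1\rangle\langle\psi_1|\psi_2\rangle] \\
&+ 2C^2\sum_{k}\big[\langle\psi_1|V_{1k}|\psi_1\rangle\langle\psi_2|V_{1k}|\psi_2\rangle + \langle\psi_1|V_{1k}|\psi_2\rangle\langle\psi_2|V_{1k}|\psi_1\rangle\big] \\
&\qquad\times(1/2)(\langle{+}{-}| - \langle{-}{+}|)(V_{3k}\otimes V_{4k})(|{+}{-}\rangle - |{-}{+}\rangle)
\end{aligned}$$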

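We work out the details for case [a]. Assume that the overlap constant $C^2$ is fixed. Then, after taking into account the normalization constraints using Lagrange multipliers, the functional to be extremized is

$$S = \langle\psi|H|\psi\rangle - E_1(\langle\psi_1|\psi_1\rangle - 1) - E_2(\langle\psi_2|\psi_2\rangle - 1) - E_{12}\big(1 - 1/(2C^2) - |\langle\psi_1|\psi_2\rangle|^2\big)$$

Setting the variational derivative

$$\delta S/\delta\psi_1^* = 0$$

here gives us

$$\begin{aligned}
& 2C^2\big[H_1|\psi_1\rangle - \langle\psi_2|\psi_1\rangle H_1|\psi_2\rangle - \langle\psi_2|H_1|\psi_1\rangle|\psi_2\rangle\big] \\
&+ 2C^2\sum_k|\langle +|V_{3k}|+\rangle|^2\big(\langle\psi_2|V_{1k}|\psi_2\rangle V_{1k}|\psi_1\rangle - \langle\psi_2|V_{1k}|\psi_1\rangle V_{1k}|\psi_2\rangle\big) \\
&- E_1|\psi_1\rangle + E_{12}\langle\psi_2|\psi_1\rangle|\psi_2\rangle = 0
\end{aligned}$$

Setting the variational derivative

$$\delta S/\delta\psi_2^* = 0$$

gives us

$$\begin{aligned}
& 2C^2\big[H_1|\psi_2\rangle - \langle\psi_1|H_1|\psi_2\rangle|\psi_1\rangle - \langle\psi_1|\psi_2\rangle H_1|\psi_1\rangle\big] \\
&+ 2C^2\sum_k|\langle +|V_{3k}|+\rangle|^2\big(\langle\psi_1|V_{1k}|\psi_1\rangle V_{1k}|\psi_2\rangle - \langle\psi_1|V_{1k}|\psi_2\rangle V_{1k}|\psi_1\rangle\big) \\
&- E_2|\psi_2\rangle + E_{12}\langle\psi_1|\psi_2\rangle|\psi_1\rangle = 0
\end{aligned}$$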
If we put the restriction of no overlap, i.e.,

$$\langle\psi_2|\psi_1\rangle = 0$$

then we get $C^2 = 1/2$ and the above equations simplify to

$$\big[H_1|\psi_1\rangle - \langle\psi_2|H_1|\psi_1\rangle|\psi_2\rangle\big] + \sum_k|\langle +|V_{3k}|+\rangle|^2\big(\langle\psi_2|V_{1k}|\psi_2\rangle V_{1k}|\psi_1\rangle - \langle\psi_2|V_{1k}|\psi_1\rangle V_{1k}|\psi_2\rangle\big) - E_1|\psi_1\rangle = 0\qquad (a)$$

$$\big[H_1|\psi_2\rangle - \langle\psi_1|H_1|\psi_2\rangle|\psi_1\rangle\big] + \sum_k|\langle +|V_{3k}|+\rangle|^2\big(\langle\psi_1|V_{1k}|\psi_1\rangle V_{1k}|\psi_2\rangle - \langle\psi_1|V_{1k}|\psi_2\rangle V_{1k}|\psi_1\rangle\big) - E_2|\psi_2\rangle = 0\qquad (b)$$

Note that (a) and (b) are consistent with the no overlap requirement $\langle\psi_2|\psi_1\rangle = 0$, as follows by premultiplying (a) by $\langle\psi_2|$ and (b) by $\langle\psi_1|$ and obtaining an identity. It is clear that in the time dependent case, we have to replace the energy values $E_1$ and $E_2$ by $i\partial/\partial t$. This is a consequence of the fact that under the no overlap condition,

< U1 @ 2 — o2 @ W1|A|d1 @ do — 2 @ 1 >= 
<1 |A|y1 > + < Y2lA|v2 > 


since we are assuming that $|\psi_l\rangle$, $l = 1, 2$ are normalized.


Remarks: $|\psi\rangle$ is the overall wave function of the positions and spins of the two electrons. The forms of $|\psi\rangle$ assumed in [a], [b], [c], [d] ensure that it is antisymmetric w.r.t. the interchange of the positions and spins of the two electrons. For example, in [a], [b] and [c], the wave function is antisymmetric w.r.t. the interchange of the positions of the two electrons and symmetric w.r.t. the interchange of the spins of the two electrons. Hence, since the product of a minus and a plus is a minus, the overall wave function is antisymmetric w.r.t. the interchange of both the positions and spins of the two electrons. In [d], the wave function is symmetric w.r.t. the interchange of the positions of the two electrons while it is antisymmetric w.r.t. the interchange of the two electron spins, so once again the overall wave function is antisymmetric. We could represent such wave functions alternately as $\psi(r_1, s_1, r_2, s_2)$ where $r_k$ and $s_k$ are respectively the position and $z$-spin component of the $k$-th electron, $k = 1, 2$. Thus in the case [a],

$$\psi(r_1, 1/2, r_2, 1/2) = C(\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1)(r_1, r_2) = C(\psi_1(r_1)\psi_2(r_2) - \psi_1(r_2)\psi_2(r_1))$$

and $\psi(r_1, s_1, r_2, s_2)$ is zero for the choices $(1/2, -1/2)$, $(-1/2, 1/2)$, $(-1/2, -1/2)$ of $(s_1, s_2)$. We can treat this wave function as a four component wave function of the position variables. Likewise, the wave function of [d] equals $C(\psi_1(r_1)\psi_2(r_2) + \psi_1(r_2)\psi_2(r_1))/\sqrt{2}$ for the choice $(s_1, s_2) = (1/2, -1/2)$, the negative of the same for $(s_1, s_2) = (-1/2, 1/2)$, and zero for the remaining two choices of $(s_1, s_2)$. The antisymmetry of these wave functions is therefore equivalently expressed as

$$\psi(r_1, s_1, r_2, s_2) = -\psi(r_2, s_2, r_1, s_1),\qquad r_1, r_2\in\mathbb{R}^3,\ s_1, s_2 = \pm 1/2$$


Now we come to the point of how to express the spin-orbit interaction terms in the form $\sum_k V_{1k}\otimes V_{2k}\otimes V_{3k}\otimes V_{4k}$. The general form of these interaction terms, as discussed above, is

$$(f_1(r_1, r_2, p_1, p_2), \sigma_1) + (f_2(r_1, r_2, p_1, p_2), \sigma_2)$$

where

$$(f_1(r_1, r_2, p_1, p_2), \sigma_1) = f_{1x}(r_1, r_2, p_1, p_2)\sigma_{1x} + f_{1y}(r_1, r_2, p_1, p_2)\sigma_{1y} + f_{1z}(r_1, r_2, p_1, p_2)\sigma_{1z}$$

and likewise for the second term. The $x, y, z$ components of the vector operators $r_1, p_1$ act in the Hilbert space $\mathcal{H}_1 = L^2(\mathbb{R}^3)$ while the components of $r_2, p_2$ act in another independent and identical Hilbert space $\mathcal{H}_2 = L^2(\mathbb{R}^3)$. The $x, y, z$ components of $\sigma_1$ act in another independent Hilbert space $\mathcal{H}_3 = \mathbb{C}^2$ and finally, the $x, y, z$ components of $\sigma_2$ act in yet another independent Hilbert space $\mathcal{H}_4 = \mathbb{C}^2$. All the components of the Hamiltonian then act in the tensor product Hilbert space

$$\mathcal{H} = \mathcal{H}_1\otimes\mathcal{H}_2\otimes\mathcal{H}_3\otimes\mathcal{H}_4$$

After discretization, $V_{1k}, V_{2k}$ become $N^3\times N^3$ matrices while $V_{3k}, V_{4k}$ are $2\times 2$ matrices. $\mathcal{H}$ becomes the finite dimensional Hilbert space $\mathbb{C}^{4N^6} = \mathbb{C}^{N^3}\otimes\mathbb{C}^{N^3}\otimes\mathbb{C}^2\otimes\mathbb{C}^2$. For example, consider an interaction term of the form

$$f_1(x_1, y_1, z_1)f_2(x_2, y_2, z_2)\,p_{1x}\sigma_{1x} = -i f_1(x_1, y_1, z_1)f_2(x_2, y_2, z_2)\,\sigma_{1x}\,\partial/\partial x_1$$


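We need to represent it in the form $\sum_k V_{1k}\otimes V_{2k}\otimes V_{3k}\otimes V_{4k}$. Let $e(k)$ denote the $N\times 1$ column vector with a one in the $k$-th position and zeros at all the other positions, where $k = 1, 2, \ldots, N$. Likewise, let $f(k)$ denote the $2\times 1$ vector with a one in the $k$-th position and zero at the other position, $k = 1, 2$. Then $-i\partial/\partial x_1$ is represented by an $N^3\times N^3$ matrix that takes the vector

$$\phi = \sum_{n_{1x}, n_{1y}, n_{1z}=1}^{N}\phi(n_{1x}, n_{1y}, n_{1z})\,e(n_{1x})\otimes e(n_{1y})\otimes e(n_{1z})\in\mathcal{H}_1 = \mathbb{C}^{N^3}$$

to the vector

$$\begin{aligned}
D_{p,1x}\phi &= -i\Delta^{-1}\sum_{n_{1x}, n_{1y}, n_{1z}}\big(\phi(n_{1x}+1, n_{1y}, n_{1z}) - \phi(n_{1x}, n_{1y}, n_{1z})\big)\,e(n_{1x})\otimes e(n_{1y})\otimes e(n_{1z}) \\
&= -i\Delta^{-1}\sum_{n_{1x}, n_{1y}, n_{1z}}[e(n_{1x})\otimes e(n_{1y})\otimes e(n_{1z})]\,[(e(n_{1x}+1) - e(n_{1x}))\otimes e(n_{1y})\otimes e(n_{1z})]^T\phi
\end{aligned}$$

where $\Delta$ is the pixel width, because we obviously have

$$[e(n_{1x})\otimes e(n_{1y})\otimes e(n_{1z})]^T\phi = \phi(n_{1x}, n_{1y}, n_{1z})$$

In other words, $D_{p,1x}$ is the $N^3\times N^3$ matrix given by

$$D_{p,1x} = -i\Delta^{-1}\sum_{n_{1x}, n_{1y}, n_{1z}}[e(n_{1x})\otimes e(n_{1y})\otimes e(n_{1z})]\,[(e(n_{1x}+1) - e(n_{1x}))\otimes e(n_{1y})\otimes e(n_{1z})]^T$$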
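The momentum matrix just described is a forward difference along $x$ tensored with identities on the $y$ and $z$ factors. Below is a minimal sketch of that construction, assuming zero-based grid indices, the convention $\phi(N+1, \cdot, \cdot) = 0$ at the boundary, and toy values `N = 3`, `delta = 0.5`; none of these choices come from the text.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def eye(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def forward_diff(n, delta):
    """Matrix of phi(k) -> (phi(k+1) - phi(k)) / delta, with phi beyond the grid taken as 0."""
    return [[((1 if j == i + 1 else 0) - (1 if j == i else 0)) / delta
             for j in range(n)] for i in range(n)]

N, delta = 3, 0.5

# -i * (forward difference in x) (x) I_y (x) I_z
Dp1x = [[-1j * v for v in row]
        for row in kron(kron(forward_diff(N, delta), eye(N)), eye(N))]

def index(nx, ny, nz):
    """Position of e(nx) (x) e(ny) (x) e(nz) in the flattened N^3 vector."""
    return (nx * N + ny) * N + nz

# Sample field phi = x, whose x-derivative is 1 away from the boundary.
phi = [0.0] * N ** 3
for nx in range(N):
    for ny in range(N):
        for nz in range(N):
            phi[index(nx, ny, nz)] = (nx + 1) * delta

out = [sum(r * p for r, p in zip(row, phi)) for row in Dp1x]
```

At interior points `out` equals $-i$ times the derivative of the sample field; at the last $x$ index the zero-extension convention produces a boundary artifact, which is one reason a periodic or one-sided stencil may be preferred in practice.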


Multiplication by $f_1(x_1, y_1, z_1)$ is represented by the $N^3\times N^3$ diagonal matrix

$$D_{f_1} = \sum_{n_{1x}, n_{1y}, n_{1z}=1}^{N} f_1(n_{1x}\Delta, n_{1y}\Delta, n_{1z}\Delta)\,[e(n_{1x})\otimes e(n_{1y})\otimes e(n_{1z})]\,[e(n_{1x})\otimes e(n_{1y})\otimes e(n_{1z})]^T$$

Likewise, $f_2(x_2, y_2, z_2)$ is represented by another $N^3\times N^3$ diagonal matrix $D_{f_2}$. Hence,

$$f_1(x_1, y_1, z_1)f_2(x_2, y_2, z_2)\,p_{1x}\sigma_{1x} = f_1(x_1, y_1, z_1)\,p_{1x}\,f_2(x_2, y_2, z_2)\,\sigma_{1x}$$

will then be represented by the $4N^6\times 4N^6$ matrix

$$(D_{f_1}D_{p,1x})\otimes D_{f_2}\otimes\sigma_{1x}\otimes I_2$$


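which is of the desired form $V_{1k}\otimes V_{2k}\otimes V_{3k}\otimes V_{4k}$. It should be noted that there are terms in the above interaction that cannot be expressed in such a completely factorized form, for example a term involving $1/|r_1 - r_2|$. However, such terms generally have the form

$$f(x_1, y_1, z_1, x_2, y_2, z_2)\,p_{1x}\sigma_{1x}\qquad (a)$$

or

$$f(x_1, y_1, z_1, x_2, y_2, z_2)\,p_{1y}\sigma_{1z}$$

or

$$f(x_1, y_1, z_1, x_2, y_2, z_2)\,p_{1x}\sigma_{1y}$$

etc., and these can be expressed in a restricted factorized form

$$V_{12}\otimes V_3\otimes V_4$$

where $V_{12}$ is $N^6\times N^6$ and acts in $\mathcal{H}_1\otimes\mathcal{H}_2$ while $V_3, V_4$ are both $2\times 2$ and each act in $\mathbb{C}^2$. In the above case (a), multiplication by $f$ is represented by the diagonal matrix

$$D_f = \sum_{n_{1x}, n_{1y}, n_{1z}, n_{2x}, n_{2y}, n_{2z}=1}^{N} f(n_{1x}\Delta, \ldots, n_{2z}\Delta)\,[\cdot]\,[\cdot]^T$$

where $[\cdot]$ denotes $[e(n_{1x})\otimes e(n_{1y})\otimes e(n_{1z})\otimes e(n_{2x})\otimes e(n_{2y})\otimes e(n_{2z})]$, and $V_{12}$ is obtained by multiplying $D_f$ on the right by the momentum matrix $D_{p,1x}\otimes I_{N^3}$, with $V_3 = \sigma_{1x}$ and $V_4 = I_2$. When such a term is present in the Hamiltonian, it contributes an amount

$$\langle\psi|V_{12}\otimes V_3\otimes V_4|\psi\rangle$$

to $\langle\psi|H|\psi\rangle$, and for example for the state [a], this evaluates to

$$C^2\langle\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1|V_{12}|\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1\rangle\,\langle +|V_3|+\rangle\langle +|V_4|+\rangle$$

Now,

$$\begin{aligned}
\langle\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1|V_{12}|\psi_1\otimes\psi_2 - \psi_2\otimes\psi_1\rangle ={}& \langle\psi_1\otimes\psi_2|V_{12}|\psi_1\otimes\psi_2\rangle + \langle\psi_2\otimes\psi_1|V_{12}|\psi_2\otimes\psi_1\rangle \\
&- \langle\psi_1\otimes\psi_2|V_{12}|\psi_2\otimes\psi_1\rangle - \langle\psi_2\otimes\psi_1|V_{12}|\psi_1\otimes\psi_2\rangle
\end{aligned}$$

The variational derivative of this w.r.t. $\psi_1^*$ evaluates to

$$\begin{aligned}
&(I_{N^3}\otimes\psi_2^*)V_{12}|\psi_1\otimes\psi_2\rangle + (\psi_2^*\otimes I_{N^3})V_{12}|\psi_2\otimes\psi_1\rangle \\
&- (I_{N^3}\otimes\psi_2^*)V_{12}|\psi_2\otimes\psi_1\rangle - (\psi_2^*\otimes I_{N^3})V_{12}|\psi_1\otimes\psi_2\rangle
\end{aligned}$$

multiplied by the factor $\langle +|V_3|+\rangle\langle +|V_4|+\rangle$. Note that $\psi_2^*$ means the conjugate transpose of $\psi_2$, which is therefore a $1\times N^3$ row vector. $I_{N^3}\otimes\psi_2^*$ and $\psi_2^*\otimes I_{N^3}$ are thus $N^3\times N^6$ matrices, and since $V_{12}$ is an $N^6\times N^6$ matrix and $|\psi_1\otimes\psi_2\rangle$ is an $N^6\times 1$ vector, the quantities $(I_{N^3}\otimes\psi_2^*)V_{12}|\psi_1\otimes\psi_2\rangle$ and $(\psi_2^*\otimes I_{N^3})V_{12}|\psi_2\otimes\psi_1\rangle$ are $N^3\times 1$ column vectors.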
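The partial contractions in the expression above may be easier to read in index form. As a sketch, write $V_{12}(a, b; c, d) = (e_a\otimes e_b)^T V_{12}(e_c\otimes e_d)$ for the matrix elements, where $e_a$ is the $a$-th unit vector of $\mathbb{C}^{N^3}$ (this index notation is introduced here for illustration only). Then the first two terms read

$$[(I_{N^3}\otimes\psi_2^*)V_{12}(\psi_1\otimes\psi_2)]_a = \sum_{b, c, d}\overline{\psi_2(b)}\,V_{12}(a, b; c, d)\,\psi_1(c)\psi_2(d)$$

$$[(\psi_2^*\otimes I_{N^3})V_{12}(\psi_2\otimes\psi_1)]_a = \sum_{b, c, d}\overline{\psi_2(b)}\,V_{12}(b, a; c, d)\,\psi_2(c)\psi_1(d)$$

so that in each case the free index $a$ sits in the tensor slot that $\psi_1^*$ occupied in the corresponding bra, and the other two terms follow the same pattern with a minus sign.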

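Now suppose we have an equation of the form

$$i\,d\psi_1/dt = F_1(\psi_1, \psi_2),\qquad i\,d\psi_2/dt = F_2(\psi_1, \psi_2)$$

as in the above derived Hartree-Fock equations. Here $F_k(\psi_1, \psi_2)$, $k = 1, 2$ are $N^3\times 1$ vector valued functions of $\psi_1, \psi_2\in\mathbb{C}^{N^3\times 1}$ and $\psi_1^*, \psi_2^*\in\mathbb{C}^{1\times N^3}$. These can be solved in MATLAB using a for loop implementation of the explicit Euler scheme with time step delta: represent $\psi_1, \psi_2$ as $N^3\times K$ matrices where psi1(:,t), psi2(:,t) are the vectors $\psi_1, \psi_2$ at time $t$, and $K$ is the total number of time samples. Then the for loop iteration will read

    for t = 1:K-1
        % explicit Euler step for i*dpsi/dt = F(psi1, psi2)
        psi1(:,t+1) = psi1(:,t) - 1i*delta*F1(psi1(:,t), psi2(:,t));
        psi2(:,t+1) = psi2(:,t) - 1i*delta*F2(psi1(:,t), psi2(:,t));
        % renormalize, since the Euler step does not preserve the norm
        psi1(:,t+1) = psi1(:,t+1)/norm(psi1(:,t+1));
        psi2(:,t+1) = psi2(:,t+1)/norm(psi2(:,t+1));
    end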
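The same Euler-plus-renormalization iteration can be sketched outside MATLAB. The version below is illustrative only: the right-hand sides `F1` and `F2` are hypothetical placeholders (a fixed toy Hermitian matrix applied to each orbital, with no coupling), standing in for the actual Hartree-Fock right-hand sides.

```python
import math

def matvec(m, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def normalize(v):
    """Rescale a complex vector to unit Euclidean norm."""
    n = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / n for a in v]

H_TOY = [[1.0, 0.2], [0.2, 2.0]]   # placeholder Hermitian matrix (assumption)

def F1(psi1, psi2):
    """Hypothetical right-hand side for psi1; ignores psi2 in this toy model."""
    return matvec(H_TOY, psi1)

def F2(psi1, psi2):
    """Hypothetical right-hand side for psi2; ignores psi1 in this toy model."""
    return matvec(H_TOY, psi2)

def evolve(psi1_0, psi2_0, delta, K):
    """Return K time samples of (psi1, psi2) from explicit Euler steps with renormalization."""
    hist1, hist2 = [normalize(psi1_0)], [normalize(psi2_0)]
    for _ in range(K - 1):
        p1, p2 = hist1[-1], hist2[-1]
        f1, f2 = F1(p1, p2), F2(p1, p2)
        # psi(t+1) = psi(t) - i * delta * F(psi1(t), psi2(t)), then rescale to unit norm
        hist1.append(normalize([a - 1j * delta * b for a, b in zip(p1, f1)]))
        hist2.append(normalize([a - 1j * delta * b for a, b in zip(p2, f2)]))
    return hist1, hist2

hist1, hist2 = evolve([1.0, 0.0], [0.0, 1.0], delta=0.01, K=50)
```

Renormalizing each orbital separately keeps the norms at one but does not by itself maintain $\langle\psi_2|\psi_1\rangle = 0$; an additional orthogonalization step would be needed if the no overlap condition has to hold exactly at every time sample.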


[18] Computing the perturbation in the singular values and vectors under a small perturbation of a matrix, with application to the MUSIC and ESPRIT algorithms.

$A\in\mathbb{C}^{M\times N}$ has an SVD

$$A = UDV^*$$

Let $\mathrm{rank}(A) = r$. Then since

$$A^*A = VD^TDV^*$$

with $D^TD$ a diagonal square matrix having exactly $r$ nonzero (positive) elements, say $\lambda_k = \sigma_k^2$, $k = 1, 2, \ldots, r$, we can write

$$A^*Av_k = \lambda_kv_k,\ k = 1, 2, \ldots, r,\qquad Av_k = 0,\ k = r+1, \ldots, N$$

Also,

$$AV = UD$$

gives for the columns $u_1, \ldots, u_r$ of $U$,

$$u_k = Av_k/\sigma_k,\quad k = 1, 2, \ldots, r$$


Suppose now that $A$ gets perturbed to $A + \delta A$. Then $B = A^*A$ will get perturbed to $B + \delta B$, where

$$\delta B = A^*\delta A + \delta A^*A + \delta A^*\delta A = \delta_1B + \delta_2B$$

where

$$\delta_1B = A^*\delta A + \delta A^*A$$

is the first order perturbation in $B$ and

$$\delta_2B = \delta A^*\delta A$$

is the second order perturbation in $B$. Now using second order perturbation theory, we write the perturbation in $v_k$ as $\delta_1v_k + \delta_2v_k$, where $\delta_1v_k$ and $\delta_2v_k$ are the first and second order perturbation terms, and the corresponding perturbation in the eigenvalue $\lambda_k = \sigma_k^2$ of $B$ as $\delta_1\lambda_k + \delta_2\lambda_k$, so that

$$(B + \delta_1B + \delta_2B)(v_k + \delta_1v_k + \delta_2v_k) = (\lambda_k + \delta_1\lambda_k + \delta_2\lambda_k)(v_k + \delta_1v_k + \delta_2v_k)$$

Equating terms of the same order on both sides gives

$$Bv_k = \lambda_kv_k,$$

$$B\delta_1v_k + \delta_1Bv_k = \lambda_k\delta_1v_k + \delta_1\lambda_kv_k,$$

$$B\delta_2v_k + \delta_1B\,\delta_1v_k + \delta_2Bv_k = \lambda_k\delta_2v_k + \delta_1\lambda_k\delta_1v_k + \delta_2\lambda_kv_k$$
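Taking the inner product of the first order equation with $v_k$ gives $\delta_1\lambda_k = v_k^*\delta_1Bv_k$, and in the non-degenerate case the second order equation gives $\delta_2\lambda_k = \sum_{j\neq k}|v_j^*\delta_1Bv_k|^2/(\lambda_k - \lambda_j) + v_k^*\delta_2Bv_k$. The sketch below checks these standard formulas numerically on a $2\times 2$ example; it assumes $B$ is perturbed directly by a small symmetric matrix (so $\delta_2B = 0$), and the numbers are arbitrary illustrative values.

```python
import math

def eig2_sym(a, b, d):
    """Exact eigenvalues, in ascending order, of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    return mean - rad, mean + rad

lam = (1.0, 3.0)                  # B = diag(1, 3), so v_k is the k-th unit vector
eps = 1e-3
dB = [[0.4 * eps, 0.7 * eps],
      [0.7 * eps, -0.2 * eps]]    # small symmetric first order perturbation of B

exact = eig2_sym(lam[0] + dB[0][0], dB[0][1], lam[1] + dB[1][1])

# First order shift: diagonal matrix element of the perturbation.
first = [dB[k][k] for k in range(2)]

# Second order shift: squared off-diagonal element over the eigenvalue gap.
second = [dB[0][1] ** 2 / (lam[0] - lam[1]),
          dB[1][0] ** 2 / (lam[1] - lam[0])]

err1 = [abs(exact[k] - (lam[k] + first[k])) for k in range(2)]
err2 = [abs(exact[k] - (lam[k] + first[k] + second[k])) for k in range(2)]
```

The first order error scales like the square of the perturbation and the second order error like its cube, which is the behaviour the order-by-order expansion predicts.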


Perturbation theoretic analysis of the SVD based ESPRIT algorithm
Step 1: The noiseless array signal model is

$$X_0 = AS,\qquad X_1 = A\Phi S$$

and the noisy array signal model is

$$Y_0 = X_0 + W_0,\qquad Y_1 = X_1 + W_1$$


Here, $A$ is an $n\times p$ matrix of full column rank $p$ while $S$ is a $p\times m$ matrix of full row rank $p$. $\Phi$ is a $p\times p$ diagonal unitary matrix. We also assume that $n < m$, and then $Y_0, Y_1$ are $n\times m$ matrices of full row rank $n$ (this assumption means that the number of time samples is larger than the number of sensors). We write the SVDs of $X_0, Y_0, X_1, Y_1$ as

$$X_0 = U_0D_0V_0^*,\qquad X_1 = U_1D_1V_1^*,$$

$$Y_0 = \hat U_0\hat D_0\hat V_0^*,\qquad Y_1 = \hat U_1\hat D_1\hat V_1^*$$

where $U_0, U_1$ are $n\times p$ matrices each with orthonormal columns, $V_0, V_1$ are $m\times p$ matrices each with orthonormal columns, $D_0, D_1$ are $p\times p$ diagonal matrices with positive diagonal entries, $\hat U_0, \hat U_1$ are $n\times n$ unitary matrices, $\hat V_0, \hat V_1$ are $m\times n$ matrices each with orthonormal columns, and $\hat D_0, \hat D_1$ are $n\times n$ diagonal matrices with positive diagonal entries. The first $p$ diagonal entries of $\hat D_0$ and $\hat D_1$ are small perturbations of the diagonal entries of $D_0$ and $D_1$ respectively. The last $n - p$ diagonal entries of $\hat D_0$ and $\hat D_1$ are small positive perturbations of zero. We can thus write

$$\hat D_0 = \mathrm{diag}[D_0 + \delta D_0,\ \delta D_{01}],\qquad \hat D_1 = \mathrm{diag}[D_1 + \delta D_1,\ \delta D_{11}],$$

$$\hat U_0 = [U_0 + \delta U_0\,|\,U_{01} + \delta U_{01}],\qquad \hat U_1 = [U_1 + \delta U_1\,|\,U_{11} + \delta U_{11}],$$

$$\hat V_0 = [V_0 + \delta V_0\,|\,V_{01}],\qquad \hat V_1 = [V_1 + \delta V_1\,|\,V_{11}]$$

where $\delta D_0, \delta D_{01}, \delta D_1, \delta D_{11}, \delta V_0, V_{01}, \delta V_1, V_{11}$ are computed by standard perturbation theory for Hermitian matrices, with the Hermitian matrices $Y_0^*Y_0$ and $Y_1^*Y_1$ being regarded as small perturbations of the Hermitian matrices $X_0^*X_0$ and $X_1^*X_1$ respectively. Note that $X_k^*X_k$, $k = 0, 1$ are $m\times m$ positive semidefinite matrices of rank $p$ while $Y_k^*Y_k$, $k = 0, 1$ are $m\times m$ positive semidefinite matrices of rank $n$. Note that $\delta V_0$ and $\delta D_0$, and likewise $\delta V_1$ and $\delta D_1$, are computed using standard non-degenerate perturbation theory for Hermitian matrices, while $V_{01}, \delta D_{01}, V_{11}$ and $\delta D_{11}$ are computed using degenerate perturbation theory for Hermitian matrices based on the secular matrix theory as described in Dirac's book "The Principles of Quantum Mechanics". Then,


$$U_0D_0 = X_0V_0,\quad U_0 = X_0V_0D_0^{-1},$$

$$U_1D_1 = X_1V_1,\quad U_1 = X_1V_1D_1^{-1}$$

Thus, since $\delta X_0 = Y_0 - X_0 = W_0$ and $\delta X_1 = Y_1 - X_1 = W_1$, we get

$$\delta U_0 = W_0V_0D_0^{-1} + X_0\delta V_0D_0^{-1} - X_0V_0D_0^{-2}\delta D_0$$

$$\delta U_1 = W_1V_1D_1^{-1} + X_1\delta V_1D_1^{-1} - X_1V_1D_1^{-2}\delta D_1$$

using first order perturbation theory. The diagonal entries of $\Phi$ are derived as the rank reducing numbers (rrn) of the matrix pencil

$$P(\gamma) = X_1 - \gamma X_0$$

Now,

$$X_1 - \gamma X_0 = U_1D_1V_1^* - \gamma U_0D_0V_0^* = U_1(D_1 - \gamma Q_1D_0Q_2^*)V_1^*$$

and since $U_1$ and $V_1$ have full column rank $p$, the rrn's of the matrix pencil $P(\gamma)$ can alternatively be obtained as the rrn's of the pencil

$$D_1 - \gamma Q_1D_0Q_2^*$$

where

$$Q_1 = U_1^*U_0,\qquad Q_2 = V_1^*V_0$$

are $p\times p$ matrices. These rrn's are therefore equivalently the solutions of the $p$-th degree polynomial equation

$$\det(D_1 - \gamma Q_1D_0Q_2^*) = 0$$
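The reason the diagonal entries of $\Phi$ are rank reducing numbers is the factorization $X_1 - \gamma X_0 = A(\Phi - \gamma I_p)S$, which loses rank exactly when $\gamma$ equals one of them. The sketch below illustrates this in the simplest square case $n = p = m = 2$, where rank reduction is the same as a vanishing determinant; the matrices `A`, `S` and the two phases are arbitrary illustrative values, and the square setting is an assumption that differs from the rectangular case $n < m$ treated in the text.

```python
import cmath

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[1.0, 2.0], [0.5, -1.0]]                   # invertible steering matrix (toy values)
S = [[1.0, 1j], [2.0, 0.5]]                     # invertible signal matrix (toy values)
phases = [cmath.exp(0.4j), cmath.exp(1.1j)]     # diagonal entries of the unitary Phi
Phi = [[phases[0], 0], [0, phases[1]]]

X0 = matmul(A, S)
X1 = matmul(matmul(A, Phi), S)

def pencil_det(gamma):
    """det(X1 - gamma * X0), which equals det(A) det(Phi - gamma I) det(S)."""
    return det2([[X1[i][j] - gamma * X0[i][j] for j in range(2)]
                 for i in range(2)])

at_rrn = [abs(pencil_det(g)) for g in phases]   # both values vanish up to rounding
away = abs(pencil_det(1.0))                     # a generic gamma leaves the pencil invertible
```

In the rectangular case the determinant is replaced by the reduced $p\times p$ pencil $D_1 - \gamma Q_1D_0Q_2^*$, as in the text.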


Now let $\gamma_k$ be one of these rrn's, i.e., one of the diagonal entries of $\Phi$. Then there exist exactly two vectors $\xi_k$ and $\eta_k$, up to a constant of proportionality, such that

$$\xi_k^*(D_1 - \gamma_kQ_1D_0Q_2^*) = 0,\qquad (D_1 - \gamma_kQ_1D_0Q_2^*)\eta_k = 0$$

These equations can be expressed as

$$\xi_k^*U_1^*(U_1D_1V_1^* - \gamma_kU_0D_0V_0^*)V_1 = 0,$$

$$U_1^*(U_1D_1V_1^* - \gamma_kU_0D_0V_0^*)V_1\eta_k = 0$$

since

$$U_1^*U_1 = V_1^*V_1 = I_p$$

or equivalently,

$$x_k^*A(\Phi - \gamma_kI_p)SV_1 = 0,\qquad U_1^*A(\Phi - \gamma_kI_p)Sy_k = 0\qquad (1)$$

where

$$x_k = U_1\xi_k,\qquad y_k = V_1\eta_k$$

Now since

$$R(A) = R(U_0) = R(U_1),\qquad R(S^*) = R(V_0) = R(V_1)$$

and the second implies

$$N(S) = R(V_1)^\perp$$

while the first implies

$$R(U_1)^\perp = N(A^*),$$

it follows that (1) implies and is in fact equivalent to

$$x_k^*A(\Phi - \gamma_kI_p)S = 0\qquad (2a)$$

and

$$A(\Phi - \gamma_kI_p)Sy_k = 0\qquad (2b)$$

or equivalently, since $S$ has full row rank and $A$ has full column rank,

$$x_k^*A(\Phi - \gamma_kI_p) = 0,\qquad (\Phi - \gamma_kI_p)Sy_k = 0$$

Moreover, since

$$x_k\in R(U_1) = R(A) = N(A^*)^\perp,\qquad y_k\in R(V_1) = R(S^*) = N(S)^\perp$$

it follows that

$$x_k^*A\neq 0,\qquad Sy_k\neq 0$$

and hence (2), when combined with the fact that only the $k$-th diagonal entry of $\Phi - \gamma_kI_p$ vanishes, implies that

$$x_k^*a_k\neq 0,\quad x_k^*a_j = 0,\ j\neq k,$$

$$s_k^Ty_k\neq 0,\quad s_j^Ty_k = 0,\ j\neq k$$

where $s_k^T$, $k = 1, 2, \ldots, p$ are the rows of $S$. Thus,

$$x_k^*X_0 = x_k^*AS = (x_k^*a_k)s_k^T,$$

$$X_0y_k = ASy_k = a_k(s_k^Ty_k)$$

Note that $a_k$, $k = 1, 2, \ldots, p$ are the columns of $A$ while $s_k^T$, $k = 1, 2, \ldots, p$ are the rows of $S$. This gives us the identity that $s_k^T$ is proportional to $x_k^*X_0$ and $a_k$ is proportional to $X_0y_k$. Note further that the above identities imply

$$(x_k^*a_k)(s_k^Ty_k) = x_k^*X_0y_k$$

Now when noise perturbs the data matrices, we have to consider the matrix pencil

$$\hat P(\gamma) = Y_1 - \gamma Y_0$$

in place of its unperturbed version $P(\gamma)$. This is a matrix of size $n\times m$ and will generally have rank $n$, but when $\gamma$ assumes one of its rrn values, the rank of this matrix pencil drops to $n - 1$. This is in contrast to the noiseless case, where the rank of $P(\gamma) = X_1 - \gamma X_0$ (which is again a matrix of size $n\times m$) drops from $p$ to $p - 1$ when $\gamma$ assumes one of the rrn values. Now,

$$\hat P(\gamma) = \hat U_1\hat D_1\hat V_1^* - \gamma\hat U_0\hat D_0\hat V_0^* = \hat U_1(\hat D_1 - \gamma\hat Q_1\hat D_0\hat Q_2^*)\hat V_1^*$$

where

$$\hat Q_1 = \hat U_1^*\hat U_0,\qquad \hat Q_2 = \hat V_1^*\hat V_0$$
$\hat D_0$ is an $n\times n$ diagonal matrix such that its first $p$ diagonal entries are large, these being the corresponding perturbations of those of $D_0$, while its last $n - p$ diagonal entries are small, these being perturbations of the zero singular values of $X_0$. We note that since $\hat U_1$ and $\hat V_1$ have full column rank $n$, it follows that the rrn's of $\hat P(\gamma)$ are the same as those of the $n\times n$ matrix

$$\hat D_1 - \gamma\hat Q_1\hat D_0\hat Q_2^*$$

Note that

$$\hat U_0 = Y_0\hat V_0\hat D_0^{-1},\qquad \hat U_1 = Y_1\hat V_1\hat D_1^{-1}$$

Note also that $\hat Q_1, \hat Q_2$ are $n\times n$ non-singular matrices. These rrn's are obtained by solving

$$\det(\hat D_1 - \gamma\hat Q_1\hat D_0\hat Q_2^*) = 0\qquad (3)$$

$p$ of these solutions will generally be close to the diagonal elements of $\Phi$ (i.e., the rrn's for the unperturbed case) while the remaining $n - p$ solutions will be close to zero. Now, we arrange our SVD so that the $p$ largest diagonal values of $\hat D_0$ appear before the remaining $n - p$ diagonal values. Then, we have as discussed above,

$$\hat D_0 = \mathrm{diag}[D_0 + \delta D_0,\ \delta D_{01}]$$

and we are interested in evaluating not the solutions of (3), but rather the solutions of the determinantal equation obtained by considering the top left $p\times p$ corner block of

$$\hat D_1 - \gamma\hat Q_1\hat D_0\hat Q_2^*$$

The top left corner block of $\hat D_1$ is $D_1 + \delta D_1$ while the top left corner block of $\hat Q_1\hat D_0\hat Q_2^*$ is

$$(\hat Q_1\hat D_0\hat Q_2^*)_{11} = (\hat Q_1)_{11}(D_0 + \delta D_0)((\hat Q_2)_{11})^* + (\hat Q_1)_{12}\,\delta D_{01}\,((\hat Q_2)_{12})^*$$

where we have used the partition

$$\hat Q_j = \begin{pmatrix}(\hat Q_j)_{11} & (\hat Q_j)_{12}\\ (\hat Q_j)_{21} & (\hat Q_j)_{22}\end{pmatrix},\quad j = 1, 2$$

with $(\hat Q_j)_{11}$ of size $p\times p$.
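Next observe that using first order perturbation theory,

$$\hat Q_1 = \hat U_1^*\hat U_0 = [U_1 + \delta U_1\,|\,U_{11} + \delta U_{11}]^*[U_0 + \delta U_0\,|\,U_{01} + \delta U_{01}]$$

implies that

$$(\hat Q_1)_{11} = (U_1 + \delta U_1)^*(U_0 + \delta U_0) = U_1^*U_0 + U_1^*\delta U_0 + \delta U_1^*U_0 = Q_1 + U_1^*\delta U_0 + \delta U_1^*U_0$$

$$(\hat Q_1)_{12} = (U_1 + \delta U_1)^*(U_{01} + \delta U_{01}) = U_1^*U_{01} + U_1^*\delta U_{01} + \delta U_1^*U_{01} = U_1^*\delta U_{01} + \delta U_1^*U_{01}$$

Likewise,

$$\hat Q_2 = \hat V_1^*\hat V_0 = [V_1 + \delta V_1\,|\,V_{11}]^*[V_0 + \delta V_0\,|\,V_{01}]$$

implies that

$$(\hat Q_2)_{11} = V_1^*V_0 + V_1^*\delta V_0 + \delta V_1^*V_0 = Q_2 + V_1^*\delta V_0 + \delta V_1^*V_0$$

$$(\hat Q_2)_{12} = V_1^*V_{01} + \delta V_1^*V_{01} = \delta V_1^*V_{01}$$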


and thus,

$$(\hat Q_1\hat D_0\hat Q_2^*)_{11} = Q_1D_0Q_2^* + Q_1\,\delta D_0\,Q_2^*$$

up to first order perturbation terms. Thus, up to first order perturbation theory, the estimated values $\hat\gamma_k$, $k = 1, 2, \ldots, p$ of the true rrn's $\gamma_k$, $k = 1, 2, \ldots, p$ are the solutions of

$$\det(D_1 + \delta D_1 - \hat\gamma\,Q_1(D_0 + \delta D_0)Q_2^*) = 0$$

and the associated left and right eigenvectors corresponding to the estimated rrn's $\hat\gamma_k$ are obtained by solving

$$\hat\xi_k^*(D_1 + \delta D_1 - \hat\gamma_kQ_1(D_0 + \delta D_0)Q_2^*) = 0,$$

$$(D_1 + \delta D_1 - \hat\gamma_kQ_1(D_0 + \delta D_0)Q_2^*)\hat\eta_k = 0$$


Remarks:

[a] $[V_0\,|\,V_{01}]$ is a matrix of size $m\times n$ having orthonormal columns. $V_{01}$ is determined by the eigenvectors of the "secular perturbation matrix" corresponding to the zero (unperturbed) eigenvalue of the $m\times m$ matrix $X_0^*X_0$ and its perturbation

$$\delta(X_0^*X_0) = X_0^*W_0 + W_0^*X_0$$

The secular matrix corresponding to this is the matrix of this perturbing operator w.r.t. an onb for the zero eigenvalue subspace of $X_0^*X_0$. This secular matrix is therefore of size $(m - p)\times(m - p)$. The $n - p$ columns of $V_{01}$ are thus orthonormal and span a subspace of the space spanned by the $m - p$ eigenvectors of $X_0^*X_0$ corresponding to its zero eigenvalue. Likewise, the $n - p$ columns of $V_{11}$ are also orthonormal and span a subspace of the space spanned by the $m - p$ eigenvectors of $X_1^*X_1$ corresponding to its zero eigenvalue. This latter $(m - p)$-dimensional space is precisely $N(X_0^*X_0) = N(X_0) = N(S) = N(X_1^*X_1) = N(X_1)$. Equivalently, $R(V_0) = R(V_1) = R(S^*)$, and $N(S) = R(S^*)^\perp$ contains $R(V_{01})$ as well as $R(V_{11})$. $R([V_0 + \delta V_0\,|\,V_{01}]) = R(Y_0^*) = R(Y_0^*Y_0)$ and $R([V_1 + \delta V_1\,|\,V_{11}]) = R(Y_1^*) = R(Y_1^*Y_1)$, and these two subspaces clearly have dimension $n$. $R(V_0) = R(V_1) = R(S^*) = N(S)^\perp$ is in particular orthogonal to $R(V_{01})$ as well as to $R(V_{11})$. In particular, $V_1^*V_{01} = 0$.


[b] Similar remarks apply to $U$ in place of $V$. The corresponding relations are obtained using $X_0^* = S^*A^*$, $X_1^* = S^*\Phi^*A^*$, $Y_0^* = X_0^* + W_0^*$, $Y_1^* = X_1^* + W_1^*$ in place of $X_0, X_1, Y_0, Y_1$. In particular, $U_1^*U_{01} = 0$. This could also be seen directly using

$$[U_0 + \delta U_0\,|\,U_{01}]\,\mathrm{diag}[D_0 + \delta D_0,\ \delta D_{01}] = (X_0 + W_0)[V_0 + \delta V_0\,|\,V_{01}]$$

which implies using first order perturbation theory that

$$U_0D_0 = X_0V_0,$$

$$U_0\,\delta D_0 + \delta U_0D_0 = X_0\delta V_0 + W_0V_0,$$

$$X_0V_{01} = 0,$$

$$U_{01}\,\delta D_{01} = W_0V_{01}$$

Thus in particular,

$$U_{01} = W_0V_{01}(\delta D_{01})^{-1}$$

These equations do not appear to imply $U_1^*U_{01} = 0$. However, let us see the contribution of this term to

$$(\hat Q_1)_{12}\,\delta D_{01}\,((\hat Q_2)_{12})^*$$

It is given by

$$U_1^*U_{01}\,\delta D_{01}\,((\hat Q_2)_{12})^* = U_1^*W_0V_{01}((\hat Q_2)_{12})^* = U_1^*W_0V_{01}V_{01}^*\delta V_1$$

which is of the second order of smallness in perturbation theory and hence can be neglected. In fact, this could directly be inferred from the fact that $(\hat Q_1)_{12}\,\delta D_{01}\,((\hat Q_2)_{12})^*$ contains the two factors $\delta D_{01}$ and $((\hat Q_2)_{12})^* = V_{01}^*\delta V_1$, each of the first order of smallness, so that the product is at least of the second order of smallness.


[19] Problem on video-conferencing (suggested to me by Prof. Vijyant Agarwal). There are $N$ speakers numbered $1, 2, \ldots, N$ conversing over a common line. Let $x_k$ denote the speech signal vector spoken by the $k$-th speaker. Assume that the listener receives a superposition of compressed versions of the different speakers. For example, if $x_k(t)$, $t = 1, 2, \ldots, M$ are the speech samples of the $k$-th speaker, then his dominant wavelet coefficients are

$$c_k(n, m) = \sum_{t=1}^{M}x_k(t)\psi_{n,m}(t),\quad (n, m)\in D_k$$

where $D_k$ is a small index pair set compared with the original number $M$ of time samples of the signal. This equation can be expressed in the form

$$c_k = A_kx_k$$

where

$$A_k = ((\psi_{n,m}(t)))_{(n,m)\in D_k,\ t\in\{1, 2, \ldots, M\}}$$

Let $\mu(D_k)$ denote the number of elements in $D_k$. Then

$$A_k\in\mathbb{R}^{\mu(D_k)\times M}$$

and since we are assuming that

$$\mu(D_k)\ll M$$

it means that $A_k$ has very few rows compared to the number of its columns. Thus $c_k\in\mathbb{R}^{\mu(D_k)}$ is a wavelet compressed version of $x_k\in\mathbb{R}^M$. The listener receives the superposition

$$c = \sum_{k=1}^{N}\alpha(k)A_kx_k$$


The listener also has some idea of what the original speech $x_k$ of the $k$-th speaker is for each $k$. This approximate signal is $w_k$. He also wishes his estimate of each speaker's speech to have a small energy. Thus, taking all these considerations into account, the listener must minimize

$$E(x_k, k = 1, 2, \ldots, N, \lambda) = \sum_{k=1}^{N}\beta(k)\|x_k - w_k\|^2 + \sum_{k=1}^{N}\gamma(k)\|x_k\|^2 + \lambda^T\Big(c - \sum_{k=1}^{N}\alpha(k)A_kx_k\Big)$$

where $\lambda$ is a Lagrange multiplier used to incorporate the constraint of the compressed superposition of the speech signals spoken by the different speakers. $E$ must first be minimized w.r.t. $x_k$, $k = 1, 2, \ldots, N$ and $\lambda$, and then a performance analysis must be carried out along the following lines:


[a] Assume that $w_k = x_k + \delta x_k$, $k = 1, 2, \ldots, N$ where the $\delta x_k$'s are small, say these are bounded in norm by $\epsilon_k$:

$$\|\delta x_k\|\leq\epsilon_k,\quad k = 1, 2, \ldots, N$$

Then derive upper bounds on

$$\|\hat x_k - x_k\|$$

[b] Assume that the $\alpha(k), \beta(k), \gamma(k)$ are not precisely known due to system uncertainties, i.e., although we use $\{\alpha(k), \beta(k), \gamma(k)\}$ during the minimization process, while implementing the estimator we use $\alpha(k) + \delta\alpha(k)$, $\beta(k) + \delta\beta(k)$, $\gamma(k) + \delta\gamma(k)$. Then how will the values of $\|\hat x_k - x_k\|$ change?


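Setting the partial derivatives of $E$ w.r.t. $x_k$ at the estimate $\hat x_k$ and w.r.t. $\lambda$ to zero then gives

$$2\beta(k)(\hat x_k - w_k) + 2\gamma(k)\hat x_k - \alpha(k)A_k^T\lambda = 0,\qquad c = \sum_{k=1}^{N}\alpha(k)A_k\hat x_k$$

Thus,

$$\hat x_k = \frac{\beta(k)}{\beta(k) + \gamma(k)}w_k + \frac{\alpha(k)}{2(\beta(k) + \gamma(k))}A_k^T\lambda$$

and then

$$c = \sum_k\frac{\alpha(k)\beta(k)}{\beta(k) + \gamma(k)}A_kw_k + \frac{1}{2}\sum_k\frac{\alpha(k)^2}{\beta(k) + \gamma(k)}A_kA_k^T\lambda$$

and thus,

$$\begin{aligned}
\hat x_k &= \frac{\beta(k)}{\beta(k) + \gamma(k)}w_k + \frac{\alpha(k)}{\beta(k) + \gamma(k)}A_k^T\Big[\sum_m\frac{\alpha(m)^2}{\beta(m) + \gamma(m)}A_mA_m^T\Big]^{-1}\Big(c - \sum_m\frac{\alpha(m)\beta(m)}{\beta(m) + \gamma(m)}A_mw_m\Big) \\
&= b(k)w_k + a(k)A_k^T\Big[\sum_me(m)A_mA_m^T\Big]^{-1}\Big(c - \sum_md(m)A_mw_m\Big)
\end{aligned}$$

where

$$a(k) = \frac{\alpha(k)}{\beta(k) + \gamma(k)},\quad b(k) = \frac{\beta(k)}{\beta(k) + \gamma(k)},\quad d(k) = \frac{\alpha(k)\beta(k)}{\beta(k) + \gamma(k)},\quad e(k) = \frac{\alpha(k)^2}{\beta(k) + \gamma(k)}$$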
Statistical performance analysis: Let

$$w_k = x_k + \delta x_k$$

where $\delta x_k$, $k = 1, 2, \ldots, N$ are random vectors assumed to be small random perturbations of the true speech signal vectors of the speakers. We also assume that $c = \sum_k\alpha(k)A_kx_k$ is known to the listener. We then obtain using the above formula
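One way to carry the substitution through, offered as a sketch rather than as the text's own result, is to insert $w_m = x_m + \delta x_m$ and $c = \sum_m\alpha(m)A_mx_m$ into the estimator, taking $b(k) = \beta(k)/(\beta(k) + \gamma(k))$, $a(k) = \alpha(k)/(\beta(k) + \gamma(k))$, $d(k) = \alpha(k)b(k)$, $e(k) = \alpha(k)a(k)$ and writing $G = \sum_me(m)A_mA_m^T$ (the symbol $G$ is introduced here only for brevity, and the coefficient definitions are the ones assumed in reading the formula above):

$$\hat x_k - x_k = -(1 - b(k))x_k + a(k)A_k^TG^{-1}\sum_m\alpha(m)(1 - b(m))A_mx_m + b(k)\delta x_k - a(k)A_k^TG^{-1}\sum_md(m)A_m\delta x_m$$

The first two terms form a deterministic bias that vanishes when every $\gamma(m) = 0$, and the last two terms are linear in the random perturbations $\delta x_m$, so their covariance follows directly from that of the $\delta x_m$.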


[20] Application of supersymmetry to the design of quantum unitary gates. 

[a] What is meant by a supersymmetric theory of elementary particles ? 

Syllabus for End-Semester Examination 

EC-SPC03 

[1] Axioms of probability theory, probability spaces, random variables, expectation, properties of the expectation map. Conditional expectation w.r.t. a sub-sigma algebra: derivation using Radon-Nikodym derivatives, and another derivation using orthogonal projections in Hilbert space and the density of $L^2(P)$ in $L^1(P)$.

[2] Chebyshev's inequality, Borel-Cantelli lemmas, definition of convergence of a sequence of random variables in distribution, in probability, in mean square, almost surely (i.e., with probability 1). Proof that convergence in mean square implies convergence in probability, and that convergence almost surely implies convergence in probability, which implies convergence in distribution. $L^p$ convergence and Hölder's inequality.


[3] Filtration on a probability space, martingales, submartingales and supermartingales w.r.t. a filtration, Doob's maximal inequality for martingales, the upcrossing and downcrossing inequalities for martingales, Kolmogorov's inequality for sums of independent random variables as a special case of Doob's maximal inequality, Doob's submartingale convergence theorem using the upcrossing inequality for submartingales.


[4] Weak and strong laws of large numbers. Proof of the weak law of 
large numbers using Chebyshev’s inequality, proof of the strong law using Kol- 
mogorov’s inequality for sums of independent random variables. 


[5a] Bayesian binary and multiple hypothesis testing, the Neyman-Pearson test, the likelihood ratio test, binary hypothesis testing for sequences of iid random variables, the asymptotic rate at which the optimal probability of miss converges to zero, identified as the relative entropy between the two probability distributions, given that the false alarm probability converges to zero (Stein's theorem; proof based on the large deviation principle).


[5b] Parameter estimation based on measured data using the maximum likelihood, maximum a posteriori (MAP) and minimum mean square error (MMSE) criteria. Asymptotic properties of the maximum likelihood estimator when the measurement is a sequence of iid random variables. The Cramér-Rao lower bound on the variance of unbiased and biased parameter estimators.


[6] Stationarity and ergodicity of a discrete time stochastic process. Condi- 
tions for mean and correlation ergodicity of a stochastic process with application 
to Gaussian processes. Application of the ergodic theorem to estimation theory. 


[7a] Estimation of parameters in linear models, linear prediction theory for 
wide sense stationary processes. The optimum causal Wiener filter derivation 
using Wiener’s spectral factorization method and using Kolmogorov’s innovation 
process theory. 


[7b] Statistical performance analysis of parameter estimation in linear models 
based on data matrix perturbation theory. 


[8] The Levinson-Durbin algorithm for order-recursive linear prediction for WSS processes with known statistics. Derivation based on Toeplitz and centrosymmetric properties of correlation matrices. Forward and backward prediction filter update formulas. Block diagrammatic representation of the lattice filter. Order recursive construction of the joint process filter using the backward prediction filters and orthogonality of the backward prediction errors of different orders.


[9] The projection operator update formula. Another derivation of the Levinson-Durbin algorithm based on this formula.


[10] The RLS algorithm for time recursive prediction and filtering for time series having unknown statistical correlations.


[11] The RLS-Lattice algorithm for joint time and order recursive prediction.


[12] The Levinson-Durbin, RLS and RLS lattice filters for multivariate time series.


[13] Brownian motion and white Gaussian noise. Einstein's derivation of the diffusion equation for the pdf of Brownian motion, Einstein's evaluation of the diffusion constant of Brownian motion in terms of viscosity, temperature, Boltzmann's constant and the radius of the pollen particle executing Brownian motion. White noise as the formal derivative of Brownian motion, correlation properties of Brownian motion and white noise, Ito's formula for Brownian motion and the Levy oscillation property.


[14] Independent increment processes, Brownian motion and Poisson pro- 
cesses as special case. The characteristic functional of a superposition of Brow- 
nian motion and independent Poisson processes. 


[15] Adapted processes and stop times w.r.t. a filtration. Doob's optional sampling theorem for martingales, exponential martingales for Brownian motion, application of the optional sampling theorem for martingales to computing the statistics of the first hitting time of Brownian motion at a given level.


[16] Construction of the Ito stochastic integral w.r.t. Brownian motion for $L^2$-adapted processes. Properties of the Ito stochastic integral (the Ito stochastic integral as an isomorphism between Hilbert spaces).


[17] The Ito stochastic differential equation, proof of existence and uniqueness of the solutions using the Lipschitz conditions on the drift and diffusion coefficients and Doob's martingale inequality, derivation of the forward and backward Fokker-Planck-Kolmogorov equations for the transition probability density of a diffusion process using Ito's formula.


[18] The periodogram power spectral density estimator for a stationary discrete time Gaussian process. Properties of the asymptotic mean and covariance of the periodogram. Inconsistency of the periodogram. Improvement on the variance of the periodogram estimator using smoothing windows applied to the data.


[19] MUSIC and ESPRIT as high resolution eigensubspace based estimators of frequencies and directions of arrival for plane wave sources incident upon an array of sensors. Finite data performance analysis of the MUSIC and ESPRIT estimators based on matrix perturbation theory, SVD based MUSIC and ESPRIT algorithms using data matrices, performance analysis of SVD based algorithms.


[20] Kushner-Kallianpur filter, Extended Kalman filter and Belavkin quan- 
tum filter for estimating the state of a process with noisy measurements. 


[21] Application of the Belavkin filter to estimating the spin of the electron interacting with a magnetic field and subject to quantum noise, based on the Hudson-Parthasarathy noisy Schrödinger equation.


[22] The energy of an electromagnetic field within a cavity resonator, quan- 
tization of this energy using field creation and annihilation operators, cavity 
field interacting with the bath noisy field, Application of the Belavkin filter for 
estimating the state of the cavity field. 


[23] The cavity electromagnetic field along with the cavity Dirac electron-positron field interacting with the bath field. Application of the Belavkin filter to estimating the cavity electromagnetic field and the cavity electron-positron field from non-demolition bath measurements.


[24] Description of the bath electron-positron field based on Fermionic cre- 
ation and annihilation processes. 


[25] Solving Dirac’s relativistic wave equation for an electron in a radial po- 
tential. Perturbation of the time dependent Dirac equation by electromagnetic 
quantum noise. 

[26] Group theoretic statistical image processing.


[a] Properties of the 3-dimensional rotation group and its Lie algebra.


[b] Properties of the d-dimensional Euclidean motion group (rotations and
translations of $\mathbb{R}^d$) and its Lie algebra.


[c] Computing the Haar measure on the rotation group.


[d] Irreducible representations of a compact group and the Schur orthogonality
relations, character of a representation. Representations of a Lie algebra
and their relation to the representations of the corresponding group.


[e] The Peter-Weyl theorem for compact groups. Proof based on the spectral
theory of compact Hermitian operators in a Hilbert space.


[f] The irreducible representations of the 3-D rotation group in terms of
spherical harmonic functions. Proof of the construction based on properties of
the angular momentum operators.


[g] Application of the group representation theoretic Fourier transform to 
estimate the scale, rotation and translation of a 3-D image field in the presence 
of blurring noise. 


[h] Statistical performance analysis of the rotation estimation algorithm. 


[i] The Frobenius-Mackey theory of induced representations of a semidirect 
product applied to the construction of the irreducible representations of the 3-D 
Euclidean motion group of rotations and translations. 


[l] Determining invariants of a group action on image fields using characters
with application to pattern classification/invariant feature extraction.


[m] The irreducible representations of $SL(2,\mathbb{C})$ with application to
determining the irreducible representations of the Lorentz group: principal
series and supplementary series of Gelfand.


[n] The irreducible representations of $SL(2,\mathbb{R})$: principal series and
the discrete series of Harish-Chandra. Relationship of $SL(2,\mathbb{R})$ to
planar Lorentz transformations, i.e., transformations in the $t$-$y$-$z$ space.


[o] The Frobenius-Young theory of irreducible representations of the permutation
group. Frobenius character formula of the permutation group in terms of
generating functions. Applications of the generating function of the Frobenius
character formula to determining invariant features of the radiation field
produced by a set of point objects under a permutation. Estimating the
permutation element from the signal collected from the radiation patterns
generated by an array of point sources and by a permutation of this array using
permutation group representations.


[27] Short notes on supersymmetric field theories. 

[a] What are Majorana Fermionic anticommuting variables
$\theta = (\theta^a : a = 1,2,3,4)$?

[b] A superfield $S(x,\theta)$ is an arbitrary smooth function of the four
Bosonic space-time coordinates $x = (x^\mu : \mu = 0,1,2,3)$ and the four
Fermionic coordinates $\theta$. It can be expressed as a fourth degree
polynomial in the Fermionic variables with coefficients being functions of the
Bosonic variables.

[c] What are supersymmetry generators $\{Q_a, \bar{Q}_a : a = 0,1,2,3\}$? They
satisfy the anticommutation relations


{Qa, Qo} = ae a? 
where $P_\mu$ is the Bosonic four momentum. Here
Q=Q77°6,7° = [b, -—h],e= diaglioy, —ioy| 
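A realization of supersymmetry generators is provided by super-vector fields
$$L_a = (\gamma^\mu\theta)_a\,\partial/\partial x^\mu + (\gamma^5\epsilon)_{ab}\,\partial/\partial\theta^b$$
and
$$\bar{L}_a = (\gamma^5\epsilon)_{ab}L_b.$$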


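Show, using the Fermionic anticommutation analog of the Bosonic commutation
relations
$$[\partial/\partial x^\mu, x^\nu] = \delta^\nu_\mu,$$
namely,
$$\{\partial/\partial\theta^a, \theta^b\} = \delta^b_a,$$
that
$$\{L_a, \bar{L}_b\} = \gamma^\mu_{ab}\,\partial/\partial x^\mu.$$
Note that
$$P_\mu = i\,\partial/\partial x^\mu$$
and hence the operators $\{L_a, \bar{L}_a\}$ provide us with a representation of
the supersymmetric Lie algebra generated by
$\{J_{\mu\nu}, P_\mu, Q_a, \bar{Q}_a\}$ where $J_{\mu\nu}$ are the four angular
momentum operators
$$J_{\mu\nu} = x_\mu P_\nu - x_\nu P_\mu.$$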


Note that $\{J_{\mu\nu}, P_\mu\}$ generate the Poincare Lie algebra consisting of
Lorentz transformations and space-time translations. So here by incorporating
Fermionic operators into this Lie algebra, we obtain the super-Poincare Lie
algebra.
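The Fermionic relation $\{\partial/\partial\theta^a, \theta^b\} = \delta^b_a$ used in the exercise above can be checked mechanically. The following is a minimal sketch, not part of the original text: it assumes a Grassmann algebra on four generators stored as a dictionary from sorted index tuples to coefficients, with the left-derivative convention (move the generator to the front, then strip it). The helper names `times_theta`, `d_theta`, `anticommutator` and `generic_element` are hypothetical.

```python
from itertools import combinations

# A Grassmann element is a dict {sorted tuple of generator indices: coefficient}.


def times_theta(a, f):
    """Left-multiply f by the generator theta^a."""
    out = {}
    for key, c in f.items():
        if a in key:
            continue  # theta^a * theta^a = 0
        pos = sum(1 for k in key if k < a)  # generators that theta^a hops over
        out[key[:pos] + (a,) + key[pos:]] = (-1) ** pos * c
    return out


def d_theta(a, f):
    """Left derivative d/d theta^a: bring theta^a to the front, then remove it."""
    out = {}
    for key, c in f.items():
        if a not in key:
            continue
        pos = key.index(a)
        out[key[:pos] + key[pos + 1:]] = (-1) ** pos * c
    return out


def add(f, g):
    """Sum of two Grassmann elements, dropping zero coefficients."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + c
    return {k: c for k, c in out.items() if c != 0}


def anticommutator(a, b, f):
    """Apply {d/d theta^a, theta^b} to the element f."""
    return add(d_theta(a, times_theta(b, f)), times_theta(b, d_theta(a, f)))


def generic_element(n=4):
    """Test element with a distinct nonzero coefficient on every monomial."""
    keys = [k for r in range(n + 1) for k in combinations(range(n), r)]
    return {key: i + 1 for i, key in enumerate(keys)}


f = generic_element()
ok = all(
    anticommutator(a, b, f) == (f if a == b else {})
    for a in range(4) for b in range(4)
)
print(ok)
```

Because the check runs over an element with every monomial present, it exercises all sign cases of the anticommutator rather than a single example.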


Supersymmetric current and its conservation: Let $S(x,\theta)$ be a superfield
and let $\chi_m(x), m = 1,2,\ldots$ denote its component fields. Let
$\mathcal{L}$ be a supersymmetric Lagrangian constructed from these component
fields. Under an infinitesimal supersymmetry transformation, $\mathcal{L}$
changes by a total four space-time divergence, i.e.,
$$\delta\mathcal{L} = \partial_\mu J^\mu.$$
We can also associate a Noether current $N^\mu$ with the change in the
Lagrangian under this infinitesimal supersymmetric transformation of the
component fields:
$$N^\mu = \sum_m \frac{\partial\mathcal{L}}{\partial\chi_{m,\mu}}\,\delta\chi_m.$$


That the Noether current is conserved when the field equations are satisfied,
provided that the Lagrangian is invariant under the supersymmetry
transformation, follows immediately by making use of the Euler-Lagrange
equations. We have in fact, from the Euler-Lagrange equations,
$$\partial_\mu\frac{\partial\mathcal{L}}{\partial\chi_{m,\mu}} = \frac{\partial\mathcal{L}}{\partial\chi_m}$$
and therefore,
$$\partial_\mu N^\mu = \frac{\partial\mathcal{L}}{\partial\chi_m}\delta\chi_m + \frac{\partial\mathcal{L}}{\partial\chi_{m,\mu}}\delta\chi_{m,\mu} = \delta\mathcal{L}$$
provided that the equations of motion are satisfied. This is zero only when
the Lagrangian is invariant under a supersymmetry transformation. However, 
in general, under a supersymmetry transformation as we noted above, the La- 
grangian changes by a four divergence and is not invariant. In other words, only 
the action integral is supersymmetry invariant. Thus, in general, we can only 
write 


$$\partial_\mu N^\mu = \delta\mathcal{L}$$
provided that the equations of motion are satisfied. It follows that the
difference of the two equations gives a conservation law:
$$\partial_\mu(J^\mu - N^\mu) = 0$$
in the general case when the equations of motion are satisfied.

Remark: $\partial_\mu J^\mu = \delta\mathcal{L}$ is always true while
$\partial_\mu N^\mu = \delta\mathcal{L}$ is true only when the equations of
motion are satisfied. Therefore, in particular, both the equations are valid
when the equations of motion are satisfied.


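Left and right superderivatives:
$$\theta_L = (1+\gamma^5)\theta/2, \quad \theta_R = (1-\gamma^5)\theta/2.$$
Then,
$$D = \gamma^\mu\theta\,\partial_\mu - \gamma^5\epsilon\,\partial_\theta,$$
$$D_L = (1+\gamma^5)D/2 = \gamma^\mu\theta_R\,\partial_\mu - \epsilon\,\partial_{\theta_L},$$
$$D_R = (1-\gamma^5)D/2 = \gamma^\mu\theta_L\,\partial_\mu + \epsilon\,\partial_{\theta_R}.$$
Note that
$$\gamma^5\gamma^\mu = -\gamma^\mu\gamma^5.$$
Further, it is clear that
$$(1-\gamma^5)D_L = 0, \quad (1+\gamma^5)D_R = 0$$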


and hence only two of the $D_L$'s are linearly independent and likewise only two
of the $D_R$'s are independent. Further any two of the $D_L$'s anticommute and
also any two of the $D_R$'s anticommute:
$$\{D_L, D_L^T\} = \{D_R, D_R^T\} = 0$$
because any two $\theta$'s anticommute, any two $\partial_\theta$'s anticommute,
and $\theta_R$ anticommutes with $\partial_{\theta_L}$ and likewise $\theta_L$
anticommutes with $\partial_{\theta_R}$. Since any two of the $D_R$'s
anticommute and since there are only two linearly independent $D_R$'s, it
follows easily that the product of any three or more $D_R$'s is zero and
likewise, the product of any three or more $D_L$'s is also zero.
Further,
$$\{\theta_L, \partial_{\theta_L}^T\} = \{\partial_{\theta_L}, \theta_L^T\} = (1+\gamma^5)/2,$$
and likewise
$$\{\theta_R, \partial_{\theta_R}^T\} = \{\partial_{\theta_R}, \theta_R^T\} = (1-\gamma^5)/2.$$

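It follows that
$$\{D_L, D_R^T\} = \{\gamma^\mu\theta_R\partial_\mu - \epsilon\partial_{\theta_L},\ (\gamma^\mu\theta_L\partial_\mu + \epsilon\partial_{\theta_R})^T\}$$
$$= [-\gamma^\mu(1-\gamma^5)\epsilon/2 - \epsilon(1+\gamma^5)\gamma^{\mu T}/2]\,\partial_\mu$$
$$= [-\gamma^\mu(1-\gamma^5)\epsilon/2 - \gamma^\mu\epsilon(1-\gamma^5)/2]\,\partial_\mu$$
$$= -\gamma^\mu(1-\gamma^5)\epsilon\,\partial_\mu.$$
This equation is the same as
$$\{D_{La}, D_{Rb}\} = -[\gamma^\mu(1-\gamma^5)\epsilon]_{ab}\,\partial_\mu$$
which is equivalent to
$$\{D_{Ra}, D_{Lb}\} = -[\gamma^\mu(1-\gamma^5)\epsilon]_{ba}\,\partial_\mu$$
or equivalently,
$$\{D_R, D_L^T\} = -[\gamma^\mu(1-\gamma^5)\epsilon]^T\,\partial_\mu$$
$$= \epsilon(1-\gamma^5)\gamma^{\mu T}\,\partial_\mu$$
$$= \epsilon\gamma^{\mu T}(1+\gamma^5)\,\partial_\mu = \gamma^\mu\epsilon(1+\gamma^5)\,\partial_\mu.$$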
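The matrix manipulations in this derivation rest on a handful of identities: $\epsilon$ commutes with $\gamma^5$, $\gamma^5$ anticommutes with $\gamma^\mu$, $\epsilon\gamma^\mu$ is skew-symmetric, and $\epsilon\gamma^{\mu T} = \gamma^\mu\epsilon$. The sketch below checks them numerically in pure Python. The explicit matrices are an assumption, since the text does not fix a representation: a chiral basis with $\gamma^5 = \mathrm{diag}[I,-I]$, $\epsilon = \mathrm{diag}[e,e]$, $e = i\sigma^2$, $\gamma^0 = -i\begin{pmatrix}0&I\\I&0\end{pmatrix}$ and $\gamma^k = -i\begin{pmatrix}0&\sigma_k\\-\sigma_k&0\end{pmatrix}$. All helper names are hypothetical.

```python
def mm(A, B):
    """Matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def tr(A):
    """Transpose."""
    return [list(r) for r in zip(*A)]


def scale(c, A):
    return [[c * x for x in r] for r in A]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks [[A, B], [C, D]]."""
    return [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]


def same(A, B):
    return all(abs(x - y) < 1e-12 for ra, rb in zip(A, B) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
E2 = [[0, 1], [-1, 0]]                      # e = i * sigma^2
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Assumed chiral representation (not fixed by the text).
I4 = block(I2, Z2, Z2, I2)
G5 = block(I2, Z2, Z2, scale(-1, I2))
EPS = block(E2, Z2, Z2, E2)
GAMMAS = [scale(-1j, block(Z2, I2, I2, Z2))] + [
    scale(-1j, block(Z2, s, scale(-1, s), Z2)) for s in SIGMA
]


def check_identities():
    """Return a dict of named boolean checks used in the {D_L, D_R^T} step."""
    res = {"eps commutes with gamma5": same(mm(EPS, G5), mm(G5, EPS))}
    for mu, g in enumerate(GAMMAS):
        res[f"gamma5 anticommutes with gamma^{mu}"] = same(mm(G5, g), scale(-1, mm(g, G5)))
        res[f"eps gamma^{mu} is skew"] = same(tr(mm(EPS, g)), scale(-1, mm(EPS, g)))
        res[f"eps gamma^{mu}T = gamma^{mu} eps"] = same(mm(EPS, tr(g)), mm(g, EPS))
        lhs = mm(mm(EPS, add(I4, G5)), tr(g))             # eps (1 + g5) gamma^T
        rhs = mm(mm(g, EPS), add(I4, scale(-1, G5)))      # gamma eps (1 - g5)
        res[f"chiral step for gamma^{mu}"] = same(lhs, rhs)
    return res


results = check_identities()
print(all(results.values()))
```

The last check in each loop iteration is exactly the replacement of $\epsilon(1+\gamma^5)\gamma^{\mu T}$ by $\gamma^\mu\epsilon(1-\gamma^5)$ made between the second and third lines of the derivation.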


A left Chiral superfield is by definition any function of $\theta_L$ and
$$x_+^\mu = x^\mu + \theta_R^T\epsilon\gamma^\mu\theta_L.$$


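We prove that a superfield $\Phi$ is left Chiral iff $D_R\Phi = 0$. The
necessity part will follow if we can show that $D_R\theta_L = 0$ and
$D_R x_+^\mu = 0$. But
$$D_R\theta_L = [\gamma^\mu\theta_L\partial_\mu + \epsilon\partial_{\theta_R}]\theta_L = 0$$
since
$$\partial_\mu\theta_L = 0 \quad\text{and}\quad \partial_{\theta_R}\theta_L = 0.$$
Also,
$$D_R x_+^\nu = [\gamma^\mu\theta_L\partial_\mu + \epsilon\partial_{\theta_R}](x^\nu + \theta_R^T\epsilon\gamma^\nu\theta_L)$$
$$= \gamma^\nu\theta_L + \epsilon\,\partial_{\theta_R}(\theta_R^T\epsilon\gamma^\nu\theta_L)$$
$$= \gamma^\nu\theta_L + \epsilon^2\gamma^\nu\theta_L = 0$$
since
$$\epsilon = \mathrm{diag}[e, e], \quad e = i\sigma^2$$
implies
$$\epsilon^2 = -I.$$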
To prove the converse, define
$$x_-^\mu = x^\mu - \theta_R^T\epsilon\gamma^\mu\theta_L.$$
Then it is clear that any superfield can be expressed as a function of
$\theta_L, \theta_R, x_+^\mu$ and $x_-^\mu$. So it suffices to prove that
$$D_R\theta_R, \quad D_R x_-^\mu$$
are non-zero. This follows at once.
Superfield equations: We know that if $S(x,\theta)$ is a superfield, then under
a supersymmetry transformation, $[S]_D$ changes by a four space-time divergence
and hence the integral $\int [S]_D\,d^4x$ is supersymmetry invariant, i.e.,
$$\int [\bar\alpha L S]_D\,d^4x = 0$$
where $\bar\alpha = \alpha^T\gamma^5\epsilon$ with $\alpha$ a Majorana Fermionic
parameter. Now it is well known that the class of left Chiral superfields is
invariant under a supersymmetry transformation and so is the class of right
Chiral superfields. To see this, we need only note that
$$L(\theta_L)^T = (\gamma^\mu\theta\,\partial_\mu + \gamma^5\epsilon\,\partial_\theta)\theta_L^T$$
$$= \gamma^5\epsilon(1+\gamma^5)/2 = \epsilon(1+\gamma^5)/2$$
which is a constant matrix and further, 
La’, = (y"00, + y°€09)(x” + ORey’ Or) 
= O+ Veo Oney’ Or 
= 70+ Pe((L—)ey’r + (1+ 7° )ev’Or)/2 
= 0+" .ey’ (Oz + OR) 
=0 


thus proving the claim. Note that we have used the facts that
$$\theta = \theta_L + \theta_R,$$
$$(\epsilon\gamma^\nu)^T = -\epsilon\gamma^\nu$$
and hence, since $\theta_L$ and $\theta_R$ anticommute,
$$\theta_R^T\epsilon\gamma^\nu\theta_L = \theta_L^T\epsilon\gamma^\nu\theta_R.$$


Let $\Phi$ be a left Chiral superfield. We claim that if $[\Phi]_F$ denotes the
coefficient of $\theta_L^T\epsilon\theta_L$ in $\Phi$, then $[\Phi]_F$ changes
by a four spatio-temporal divergence under a supersymmetry transformation. To
see this, using the left Chiral property of $\Phi$, we expand $\Phi$ as
$$\Phi(\theta_L, x_+) = \phi_1(x_+) + \theta_L^T\epsilon\,\phi_2(x_+) + \theta_L^T\epsilon\theta_L\,\phi_3(x_+).$$
Note that the product of three or more of the $\theta_L$'s is zero and hence
the last term is
$$\theta_L^T\epsilon\theta_L\,\phi_3(x_+) = \theta_L^T\epsilon\theta_L\,\phi_3(x).$$
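Now applying the infinitesimal supersymmetry transformation $\bar\alpha L$ to
$\Phi$ gives
$$\alpha^T\gamma^5\epsilon\,L\Phi$$
and we find that
$$L\Phi = (\gamma^\mu\theta\,\partial_\mu + \gamma^5\epsilon\,\partial_\theta)\Phi$$
and we easily see that the term in this expression that is quadratic in the
$\theta$'s is given by
$$\gamma^\mu\theta\,(\theta_L^T\epsilon\,\phi_{2,\mu}(x)) + \gamma^5\epsilon\,\partial_\theta(\theta_L^T\epsilon\,\phi_{2,\mu}(x)\,\theta_R^T\epsilon\gamma^\mu\theta_L)$$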


and it is easily seen that this is a perfect four space-time divergence and in
particular, the coefficient of $\theta_L^T\epsilon\theta_L$ is a perfect
space-time four divergence. This proves the claim. Now, if $\Phi$ is left
Chiral, then for any function $f$ of one variable, $f(\Phi)$ (to be interpreted
as a power series in $\Phi$) is again left Chiral, since it is also a function
of $\theta_L$ and $x_+$ only, and hence $[f(\Phi)]_F$ is a candidate for a
supersymmetric Lagrangian. If $K(x,y)$ is a function of two variables and
$\Phi$ is left Chiral, then $[K(\Phi^*,\Phi)]_D$ is also a candidate for a
supersymmetric Lagrangian. Thus we take for our matter field supersymmetric
Lagrangian
$$\mathcal{L}_M = [K(\Phi^*,\Phi)]_D + [f(\Phi)]_F.$$
In quantum field theory, the matter fields are composed of scalar Klein-Gordon
fields, the Dirac electron-positron field and the Yang-Mills matter field; the
latter is an extended version of the Dirac electron-positron field and includes
particles like nucleons.


Remark: The construction of conserved currents and the associated symmetry
generated by the corresponding charge is familiar even in classical mechanics.
For example, suppose $L(q,q')$ is a Lagrangian which changes by a total time
derivative under the infinitesimal symmetry $q \to q + \delta q(q,q')$. Then we
can write
$$(\partial L/\partial q)\delta q + (\partial L/\partial q')\delta q' = dF(q,q')/dt$$
$$= (\partial F/\partial q)q' + (\partial F/\partial q')q''.$$
Here, we are assuming that $F$ depends only on $q,q'$. Now,
$$\delta q' = (\partial\delta q/\partial q)q' + (\partial\delta q/\partial q')q''$$
and hence equating the coefficients of $q''$ on both sides gives
$$(\partial L/\partial q')(\partial\delta q/\partial q') = \partial F/\partial q' \qquad (1a)$$
or equivalently,
$$p\,\partial\delta q/\partial q' = \partial F/\partial q' \qquad (1b)$$
and thus, we also have
$$(\partial L/\partial q)\delta q + (\partial L/\partial q')(\partial\delta q/\partial q)q' = (\partial F/\partial q)q' \qquad (2a)$$
or equivalently,
$$(\partial L/\partial q)\delta q + p(\partial\delta q/\partial q)q' = (\partial F/\partial q)q' \qquad (2b)$$
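When the Euler-Lagrange equations of motion are satisfied, it then follows that
$$\frac{d}{dt}\left(\frac{\partial L}{\partial q'}\delta q - F\right) = \frac{d}{dt}\left(\frac{\partial L}{\partial q'}\right)\delta q + \frac{\partial L}{\partial q'}\left(\frac{\partial\delta q}{\partial q}q' + \frac{\partial\delta q}{\partial q'}q''\right) - \frac{dF}{dt}$$
$$= \frac{\partial L}{\partial q}\delta q + \frac{\partial L}{\partial q'}\left(\frac{\partial\delta q}{\partial q}q' + \frac{\partial\delta q}{\partial q'}q''\right) - \frac{dF}{dt} = 0.$$
In fact, this conservation law does not depend on using the fact that $F$ is a
function of only $q,q'$. It can be any function of time. However, suppose we
introduce the conserved charge
$$Q = (\partial L/\partial q')\delta q - F = p\,\delta q - F.$$
We just showed that when the equations of motion are satisfied,
$$Q' = 0.$$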
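A concrete instance of this construction is time translation, $\delta q = q'$, for which $\delta L = dL/dt$, so $F = L(q,q')$ and the charge $Q = p\,\delta q - F = pq' - L$ is the energy. The sketch below is an illustration only: the potential $V(q) = q^2/2 + q^4/4$, the unit mass, the fourth-order Runge-Kutta integrator and the step size are all assumptions chosen for the example, and the function names are hypothetical. It integrates the Euler-Lagrange equation and monitors $Q$ along the trajectory.

```python
def acceleration(q):
    """Euler-Lagrange equation q'' = -V'(q) for the assumed V(q) = q^2/2 + q^4/4."""
    return -q - q ** 3


def charge(q, v):
    """Q = p*dq - F with dq = q' and F = L, i.e. the energy p*q' - L (unit mass)."""
    p = v
    lagrangian = 0.5 * v * v - (0.5 * q * q + 0.25 * q ** 4)
    return p * v - lagrangian


def rk4_step(q, v, dt):
    """One classical fourth-order Runge-Kutta step for (q' = v, v' = a(q))."""
    k1q, k1v = v, acceleration(q)
    k2q, k2v = v + 0.5 * dt * k1v, acceleration(q + 0.5 * dt * k1q)
    k3q, k3v = v + 0.5 * dt * k2v, acceleration(q + 0.5 * dt * k2q)
    k4q, k4v = v + dt * k3v, acceleration(q + dt * k3q)
    q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return q_new, v_new


def simulate(q0, v0, dt, steps):
    """Return the list of charges Q(t_k) along the numerical trajectory."""
    q, v = q0, v0
    charges = [charge(q, v)]
    for _ in range(steps):
        q, v = rk4_step(q, v, dt)
        charges.append(charge(q, v))
    return charges


charges = simulate(q0=1.0, v0=0.0, dt=1e-3, steps=5000)
drift = max(abs(c - charges[0]) for c in charges)
print(drift)
```

For this choice, relations (1b) and (2b) hold trivially: $\partial\delta q/\partial q' = 1$ gives $p = \partial L/\partial q'$, and $\partial\delta q/\partial q = 0$ gives $(\partial L/\partial q)q' = (\partial F/\partial q)q'$. The printed drift reflects only the integrator's discretization error.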


We now assume that $F = F(q,q')$. Then $Q = Q(q,q')$ and therefore $Q$ is an
observable in the context of classical mechanics. Further, by virtue of (1),
$$\{Q, q\} = \{p, q\}\delta q + p\{\delta q, q\} - \{F, q\}$$
$$= \delta q + p(\partial\delta q/\partial q')\{q', q\} - (\partial F/\partial q')\{q', q\}$$
$$= \delta q + (p\,\partial\delta q/\partial q' - \partial F/\partial q')\{q', q\} = \delta q.$$
This equation is true under all conditions, i.e., even when the equations of
motion are not satisfied. Further, under the conditions that the equations of
motion are satisfied,
$$\delta q' = \{Q, q\}' = \{Q', q\} + \{Q, q'\} = \{Q, q'\}.$$
Thus, indeed $Q$ generates the symmetry group of transformations that leave
the action integral invariant. In the context of fields and supersymmetry, the
supersymmetric transformation changes $\mathcal{L}$ by a four space-time
divergence:
$$\delta\mathcal{L} = \partial_\mu J^\mu.$$


Here, the role played by $F$ in the above discussion on particle mechanics is
played by $J^\mu$ and the role played by $dF/dt$ is played by
$\partial_\mu J^\mu$. The role played by
$$(\partial L/\partial q)\delta q + (\partial L/\partial q')\delta q',$$
which when the equations of motion are satisfied equals
$$\frac{d}{dt}\left((\partial L/\partial q')\delta q\right),$$
is played by
$$\partial_\mu N^\mu.$$
In other words, the role played by
$$p\,\delta q = (\partial L/\partial q')\delta q$$
is played by $N^\mu$. Thus the role played by the conserved charge
$Q = p\,\delta q - F$ is played by the supersymmetry current $N^\mu - J^\mu$.
The role played by the Noether conservation of charge equation
$$Q' = 0$$
is played by the conservation equation of supersymmetry current
$$\partial_\mu(N^\mu - J^\mu) = 0.$$
If $\Phi$ is a left Chiral superfield, then for any function $f$ of one
variable defined as a power series, $f(\Phi)$ is also left Chiral and hence
$[f(\Phi)]_F$ is a supersymmetric Lagrangian and so is $[\Phi^*\Phi]_D$. A
candidate Lagrangian for the matter field is then
$$\mathcal{L}_M = [\Phi^*\Phi]_D + c[f(\Phi)]_F.$$


Now, $\Phi$ can be expressed as $D_R^2 S$ where $S$ is some superfield and $D_R^2 = D_R^T\epsilon D_R$
with 7 7 
Dr = D1 7°)/2 = D?ye(1 — 7”)/2 


= Dye 
Recall that since the product of any three $D_R$'s is zero, it follows that
$D_R^2 S$ is annihilated by $D_R$ and hence is left Chiral. Now
$$[f(\Phi)]_F = [f(D_R^2 S)]_F$$
combined with the fact that $D_R^2$ equals
$\partial_{\theta_R}^T\epsilon\,\partial_{\theta_R}$ plus terms involving the
Bosonic derivatives $\partial_\mu$ with constant Fermionic parameters implies
(using the power series expansion of $f$) that $f(D_R^2 S)$ equals
$D_R^2\tilde{S}$ plus perfect Bosonic divergences for some superfield
$\tilde{S}$. Note for example that
$$(D_R^2 S)^2 = D_R^2(S\cdot D_R^2 S) + X$$
where $X$ is a perfect Bosonic divergence with constant Fermionic coefficients.
Now, $[D_R^2\tilde{S}]_F$, which is the coefficient of
$\theta_L^T\epsilon\theta_L$ in $D_R^2\tilde{S}$, must coincide with a nonzero
real constant $c_2$ times $[\tilde{S}]_D$ plus a total space-time four
divergence. Note that
$\theta_L^T\epsilon\theta_L\cdot\theta_R^T\epsilon\theta_R$ equals a non-zero
real constant times $(\theta^T\epsilon\theta)^2$. Recall that $[\tilde{S}]_D$
equals the coefficient of $(\theta^T\epsilon\theta)^2$ in $\tilde{S}$ plus a
perfect space-time divergence. Hence, in terms of the Berezin integral, we can
write (up to the constant $c_2$)
$$\int [f(\Phi)]_F\,d^4x = \int [D_R^2\tilde{S}]_F\,d^4x = \int [\tilde{S}]_D\,d^4x = \int \tilde{S}\,d^4x\,d^4\theta.$$
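Now, we can write the total matter action as
$$\int (D_R^2 S)^*(D_R^2 S)\,d^4x\,d^4\theta + c_1\int [f(D_R^2 S)]_F\,d^4x.$$
We recall that from the definition of Majorana Fermionic parameters,
$$\theta^* = \gamma^5\epsilon\gamma^0\theta$$
where $\gamma^5\epsilon\gamma^0$ has the block structure
$$\begin{pmatrix} 0 & e \\ -e & 0 \end{pmatrix}$$
and we can interpret this equation in terms of $2\times 1$ components as
$$\theta_L^* = e\theta_R, \quad \theta_R^* = -e\theta_L$$
where
$$e = i\sigma^2.$$
This is in agreement with the condition
$$(\theta_L^*)^* = \theta_L, \quad (\theta_R^*)^* = \theta_R$$
since
$$e^2 = -I,$$
provided that we interpret $\theta_L$ and $\theta_R$ as $2\times 1$ Fermionic
vectors. Note that
$$\epsilon = \mathrm{diag}[e, e], \quad \gamma^5\epsilon = \mathrm{diag}[e, -e].$$
We can write
$$\bar\theta = (\theta^*)^T\gamma^0 = \theta^T\gamma^5\epsilon$$
and hence,
$$\bar\theta_L = \theta_L^T\gamma^5\epsilon, \quad \bar\theta_R = \theta_R^T\gamma^5\epsilon$$
when interpreted as $4\times 1$ Fermionic vectors. Also,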
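$$D^* = \gamma^5\epsilon\gamma^0 D$$
and hence,
$$D_R^* = ((1-\gamma^5)/2)D^* = \gamma^5\epsilon\gamma^0 D_L,$$
$$D_L^* = ((1+\gamma^5)/2)D^* = \gamma^5\epsilon\gamma^0 D_R$$
when interpreted as $4\times 1$ vector valued super-vector fields. When
interpreted as $2\times 1$ vector valued super-vector fields, these equations
should be read as
$$D_R^* = -eD_L, \quad D_L^* = eD_R.$$
When interpreted as $2\times 1$ vector fields, it should be understood that
$D_L$ is represented by $\begin{pmatrix} D_L \\ 0 \end{pmatrix}$ and $D_R$ by
$\begin{pmatrix} 0 \\ D_R \end{pmatrix}$. Then,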


[(@*Poete = [veda 
= i (D3,8)*(D?,8)d* xd‘ = ‘| ®* D7,Sd* xd‘ 


=- , (D7 0*)Sd'xd*0 
This is because 
&*D2S = &* DkeDRS = 
(—1)?®) DE (@* DS) — (1)? DEO* DRS 
= (-1)?) DE(®*eDpS) + (1)? (eDp)? &* Drs 
= (—1)?® D2(®*eDpS) + DE((eDp)®*.S) 
—(DieDr®*).S 


and 
—(DEeDRO")S => —(De eD,,6)"S 


= —(Dz(Pey’)" ePePDi®)"S 
= (Dr yeyreyrey’ Dio)" S
= —(Di7eyD1o)*S = 
= —(D? eD,6)*S = —(D76)*S 
Thus 


9 


if * D?,Sd‘ad*6 = — (D7,®)*Sd*ad*0 
and in exactly the same way, we can show that 


/ [®*5®] pd*x = / [b* D255] pdt 


= / * D7,5Sd‘ad*6 = — / (D7 6)*5 Sd‘ xd‘ 
Likewise consider 
5 [Ut DES\edte = | 5F(D}5)] nate 
Now consider for example 
6(D2,S):=D258, 
5((D2,S)*) = D2,6S.D2,S + (—1)?) D2.S.D2,58 
= D26S.D2,8 + (—1)?)+P@)P@5) 4 D2.65.D2.5 
=2D208.D25 


since 
p(dS) = p(S) 
Likewise, in general, we have 
O(DES)"=D7 05. a(Des)* ” 
and hence, 
Of (DES) = D208. (O28) = D268. f (2) 


Now, 
Drf'(®) = Drf’ (D2) = 0 


since Dr operating on a constant is zero and the product of three D‘ps is zero. 
Thus, 
Dr (58. f'(®)) = DrdS.f'(®), 


D2(58.f"(®)) = D25S.f"(®) 
Thus, 
5 fr@)ra'e = [(D295.1"(®)]ra'x = 
[OR6S.1@) wae = [185.@)lvdx


= / f!(®)6Sd4ad*o 


since $\delta S$, $S$ and $D_R^2\delta S$ have the same parity. Thus, our
superfield equations
$$\delta\left[\int \Phi^*\Phi\,d^4x\,d^4\theta + c_1\int [f(\Phi)]_F\,d^4x\right] = 0$$
for $\Phi$ a left Chiral field result in the superfield equations
$$-(D_L^2\Phi)^* + c_1 f'(\Phi) = 0$$
or equivalently
$$D_L^2 D_R^2 S = c_1 f'(D_R^2 S)^*.$$
Note that both sides of this equation are right Chiral superfields. More
generally, if we replace the term $[\Phi^*\Phi]_D$ in the Lagrangian by
$[K(\Phi^*,\Phi)]_D$ with $\Phi = D_R^2 S$, then the variation of the
corresponding action integral w.r.t. $S$ is given by
$$\int [\delta K((D_R^2 S)^*, D_R^2 S)/\delta\Phi]\,D_R^2\delta S\,d^4x\,d^4\theta$$
$$= -\int [D_R^2\,\delta K(D_L^2 S^*, D_R^2 S)/\delta\Phi]\,\delta S\,d^4x\,d^4\theta.$$
Note that in deriving this equation, we have used the fact that $D_R$ acting on
any superfield consists of a sum of terms having fewer than four Fermionic
parameter products and terms that are total Bosonic space-time divergences.
These terms cancel out when integrated over $d^4x\,d^4\theta$. So our
superfield equations in this case generalize to
$$-D_R^2\,\delta K(D_L^2 S^*, D_R^2 S)/\delta\Phi + c f'(D_R^2 S) = 0.$$
Both sides here are left Chiral superfields.
Supergravity:

Let $\omega_\mu^{mn}$ denote the spinor connection of the gravitational field.
Then if $\Gamma^m$ are the Dirac matrices in four dimensions and $e^\mu_m$ is
the tetrad basis of space-time being used, the covariant derivative of a spinor
field is defined by
$$D_\mu\psi = (\partial_\mu + (1/4)\omega_\mu^{mn}\Gamma_{mn})\psi$$
where
$$\Gamma_{mn} = [\Gamma_m, \Gamma_n].$$
The curvature tensor in spinor notation is
$$R_{\mu\nu} = [\partial_\mu + (1/4)\omega_\mu^{mn}\Gamma_{mn},\ \partial_\nu + (1/4)\omega_\nu^{rs}\Gamma_{rs}]$$
$$= (1/4)(\omega_{\nu,\mu}^{mn} - \omega_{\mu,\nu}^{mn})\Gamma_{mn} + (1/16)\omega_\mu^{mn}\omega_\nu^{rs}[\Gamma_{mn}, \Gamma_{rs}].$$


Now using the anticommutator
$$\{\Gamma_m, \Gamma_n\} = 2\eta_{mn}$$
we can easily show that
$$[\Gamma_{mn}, \Gamma_{rs}] = 4(\eta_{ms}\Gamma_{nr} + \eta_{nr}\Gamma_{ms} - \eta_{mr}\Gamma_{ns} - \eta_{ns}\Gamma_{mr}).$$
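The commutator identity for $\Gamma_{mn} = [\Gamma_m, \Gamma_n]$ can be confirmed by brute force over all index combinations. The sketch below does this in pure Python. It assumes the metric signature $\eta = \mathrm{diag}(1,-1,-1,-1)$ and the standard Dirac representation of the gamma matrices, neither of which is fixed by the text; the identity itself only uses the Clifford relation, so other conventions should behave the same way. All helper names are hypothetical.

```python
from itertools import product


def mm(A, B):
    """4x4 matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def lin(a, A, b, B):
    """Return the linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def comm(A, B):
    return lin(1, mm(A, B), -1, mm(B, A))


def neg(A):
    return [[-x for x in r] for r in A]


def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks [[A, B], [C, D]]."""
    return [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

ETA = [1, -1, -1, -1]                        # assumed signature (+,-,-,-)
G_UP = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]
G = [[[ETA[m] * x for x in row] for row in G_UP[m]] for m in range(4)]  # lowered index
G2 = [[comm(G[m], G[n]) for n in range(4)] for m in range(4)]           # Gamma_{mn}


def eta(m, n):
    return ETA[m] if m == n else 0


def clifford_ok():
    """Check {Gamma_m, Gamma_n} = 2 eta_{mn} I for the lowered matrices."""
    for m, n in product(range(4), repeat=2):
        anti = lin(1, mm(G[m], G[n]), 1, mm(G[n], G[m]))
        for i, j in product(range(4), repeat=2):
            target = 2 * eta(m, n) if i == j else 0
            if abs(anti[i][j] - target) > 1e-12:
                return False
    return True


def count_mismatches():
    """Count index tuples (m, n, r, s) violating the commutator identity."""
    bad = 0
    for m, n, r, s in product(range(4), repeat=4):
        lhs = comm(G2[m][n], G2[r][s])
        rhs = [[4 * (eta(m, s) * G2[n][r][i][j] + eta(n, r) * G2[m][s][i][j]
                     - eta(m, r) * G2[n][s][i][j] - eta(n, s) * G2[m][r][i][j])
                for j in range(4)] for i in range(4)]
        if any(abs(lhs[i][j] - rhs[i][j]) > 1e-12
               for i in range(4) for j in range(4)):
            bad += 1
    return bad


print(clifford_ok(), count_mismatches())
```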


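Thus
$$R_{\mu\nu} = (1/4)(\omega_{\nu,\mu}^{mn} - \omega_{\mu,\nu}^{mn})\Gamma_{mn}$$
$$+ (1/4)\omega_\mu^{mn}\omega_\nu^{rs}(\eta_{ms}\Gamma_{nr} + \eta_{nr}\Gamma_{ms} - \eta_{mr}\Gamma_{ns} - \eta_{ns}\Gamma_{mr}).$$
This can be expressed as
$$R_{\mu\nu} = (1/4)R_{\mu\nu}^{mn}\Gamma_{mn}$$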


where 
Mr __ mn mn rn ms ms Tn sn rm mr ns 
fis = Wy i Wy i i Wy Wy Nrs + Wy Wy Nsr a Wy, Wy Mer — Wy Wy Mrs 
mn mn rn ms ms rn sn rm mr ns 
Gg Op Tete, ee Oe, a SO WE) 
—s mn mn mr sn mr ns 
= Wr Wan + 2M (wii We =e te, ) 


It is easily shown that when the spinor connection $\omega_\mu^{mn}$ for the
gravitational field is appropriately chosen so that the Dirac equation in
curved space-time remains invariant under both diffeomorphisms and local
Lorentz transformations, then the Riemann curvature tensor as defined usually
in terms of the Christoffel connection symbols coincides with
$R_{\mu\nu\rho\sigma} = R_{\mu\nu}^{mn}e_{m\rho}e_{n\sigma}$. In particular,
$R = R_{\mu\nu}^{mn}e_m^\mu e_n^\nu$ is the scalar curvature of space-time. The
spinor connection $\omega_\mu^{mn}$ is chosen so that the covariant derivative
of the tetrad $e^n_\nu$ having one spinor index and one vector index is zero:
$$0 = D_\mu e^n_\nu = e^n_{\nu,\mu} - \Gamma^\rho_{\nu\mu}e^n_\rho + \omega_\mu^{nm}e_{m\nu}.$$
This is an algebraic equation for $\omega_\mu^{mn}$ and is easily solved.
However when there are spinor fields like the gravitino in addition to the
gravitational field specified by the tetrad $e^n_\mu$ (i.e., the graviton),
then the definition of the spinor connection has to be modified and it is
expressed in terms of both the graviton and the gravitino fields. This equation
is obtained by first considering the supergravity Lagrangian in four
space-time dimensions
$$c\,eR + i\bar\chi_\mu\Gamma^{\mu\nu\rho}D_\nu\chi_\rho$$
where $\chi_\mu$ is a Majorana spinor having an additional vector index $\mu$.
The graviton tetrad field $e^n_\mu$ is Bosonic while the gravitino $\chi_\mu$
is Fermionic. These are all considered in the quantum theory to be operator
valued fields. Note the following self consistent definitions:


n ron vy _pn 
TY = el Tag = Gul =I eng 
ee 
Ty, = Nm » Ey env = Guv; CF eri =1)nm, 
m v 
En = Mnmey = Gpvln 
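$$\Gamma^{\mu\nu\rho} = \Gamma^\mu\Gamma^{\nu\rho} + \Gamma^\nu\Gamma^{\rho\mu} + \Gamma^\rho\Gamma^{\mu\nu}$$
where
$$\Gamma^{\mu\nu} = [\Gamma^\mu, \Gamma^\nu].$$
Thus $\Gamma^{\mu\nu\rho}$ is obtained by antisymmetrizing the product
$\Gamma^\mu\Gamma^\nu\Gamma^\rho$ over all the three indices. We can also
clearly write
$$\Gamma^{\mu\nu} = e^\mu_m e^\nu_n\Gamma^{mn}, \quad \Gamma^{\mu\nu\rho} = e^\mu_m e^\nu_n e^\rho_k\Gamma^{mnk}.$$
In general, we can define
$$\Gamma^{\mu_1\ldots\mu_k} = \sum_{\sigma\in S_k}\mathrm{sgn}(\sigma)\,\Gamma^{\mu_{\sigma 1}}\cdots\Gamma^{\mu_{\sigma k}}.$$
This is obtained by totally antisymmetrizing the product
$\Gamma^{\mu_1}\cdots\Gamma^{\mu_k}$ over all its $k$ indices. The basic
property of a Majorana Fermionic operator field $\psi(x)$ is that apart from
all its components anticommuting with each other, it has four components and
satisfies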


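where if
$$\psi(x) = \begin{pmatrix}\psi_1(x)\\ \psi_2(x)\\ \psi_3(x)\\ \psi_4(x)\end{pmatrix}$$
then
$$\psi(x)^* = \begin{pmatrix}\psi_1(x)^*\\ \psi_2(x)^*\\ \psi_3(x)^*\\ \psi_4(x)^*\end{pmatrix},$$
$\psi_k(x)^*$ denoting the operator adjoint of $\psi_k(x)$ in the Fock space on
which it acts. Also we define
$$\psi(x)^T = [\psi_1(x), \psi_2(x), \psi_3(x), \psi_4(x)]$$
so that we have
$$(\psi(x)^*)^T = [\psi_1(x)^*, \psi_2(x)^*, \psi_3(x)^*, \psi_4(x)^*].$$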


Advanced Probability and Statistics: Remarks and Problems 113 


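Now observe that
$$\epsilon = \mathrm{diag}[i\sigma^2, i\sigma^2], \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}.$$
Note that $\epsilon$ is a real skewsymmetric matrix. We write
$$e = i\sigma^2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},$$
$$\epsilon = \mathrm{diag}[e, e], \quad e^2 = -I,$$
$$\gamma^5\epsilon\gamma^0 = \begin{pmatrix} e & 0 \\ 0 & -e \end{pmatrix}\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} = \begin{pmatrix} 0 & e \\ -e & 0 \end{pmatrix}.$$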


Thus, the condition for $\psi$ to be a Majorana Fermion can be stated as
$$\psi_{1:2}^* = e\psi_{3:4}, \quad \psi_{3:4}^* = -e\psi_{1:2}$$
or equivalently,
$$(\psi_{1:2}^*)^T = -(\psi_{3:4})^T e, \quad (\psi_{3:4}^*)^T = (\psi_{1:2})^T e.$$
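The two block conditions are mutually consistent only because the matrix $M = \begin{pmatrix} 0 & e \\ -e & 0\end{pmatrix}$ is real and squares to the identity, so that conjugating twice returns the original spinor. The sketch below checks this in pure Python. It is an illustration under stated assumptions: $\gamma^0$ is taken as the Hermitian matrix $\begin{pmatrix}0&I\\I&0\end{pmatrix}$, and ordinary commuting complex numbers stand in for the anticommuting operator-valued components, which is enough to test the linear algebra but not the Fermionic statistics. The function names are hypothetical.

```python
def mm(A, B):
    """Matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def neg(A):
    return [[-x for x in r] for r in A]


def block(A, B, C, D):
    """Assemble a 4x4 matrix from four 2x2 blocks [[A, B], [C, D]]."""
    return [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
E2 = [[0, 1], [-1, 0]]                  # e = i * sigma^2

GAMMA5_EPS = block(E2, Z2, Z2, neg(E2))  # gamma^5 epsilon = diag[e, -e]
BETA = block(Z2, I2, I2, Z2)             # assumed Hermitian gamma^0
M = mm(GAMMA5_EPS, BETA)                 # block structure [[0, e], [-e, 0]]


def majorana_from_upper(u):
    """Build psi = (u, -e u*) from a 2-vector u so that psi* = M psi."""
    u_conj = [complex(z).conjugate() for z in u]
    lower = [-sum(E2[i][k] * u_conj[k] for k in range(2)) for i in range(2)]
    return [complex(z) for z in u] + lower


def satisfies_majorana(psi):
    """Check the condition psi* = M psi componentwise."""
    lhs = [z.conjugate() for z in psi]
    rhs = [sum(M[i][k] * psi[k] for k in range(4)) for i in range(4)]
    return all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))


psi = majorana_from_upper([1 + 2j, 3 - 1j])
print(M, satisfies_majorana(psi))
```

A generic complex 4-vector fails the test, which shows that the condition really halves the number of independent components.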


Also, 
OE Toe Pere 


are Hermitian matrices. We observe that if w is a Majorana Fermion, 
b= (pb)? = yl? 
So we can also write down the Lagrangian of the gravitino as 
tXplhY? DL Xp 
Sik ay, 
Styl ek le De 


We can verify that apart from a perfect divergence, this quantity is a Hermitian 
operator field. First observe that 


ns a a 
(Rid Red a ad Rd ee 
ioe dae Reo Rig 
SS operene 
so that on antisymmetrizing over the three indices, we get
(ee peee* = _popeve 
Thus, 
1 
(ag, TOL Cay 
ae ie Neel Biase on 
=i hry, 
S20 lr, 
= 8, (-ixt OTH"? x5) 
Aig, Pele Pies, 
proving our claim provided that we replace D, by 0,. If we take the connection 
into account, ie 
DiXp = WXp + 1/4)w? "TD imnXp 
ke 
then it follows that we must prove the skew-Hermitianity of the operator fields 
5 aes as Wana ter crn Sad ——-(a) 
and 
«T WOppvp a 
pi gd ais ore) Bos ec | 


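However the field (b) is identically zero since $\Gamma^{\mu\nu\rho}$ is
antisymmetric in $(\nu,\rho)$ while $\Gamma^\sigma_{\nu\rho}$ is symmetric in
$(\nu,\rho)$. Hence, we have to prove only the skew-Hermitianity of the field
$$\bar\chi_\mu\Gamma^{\mu\nu\rho}\Gamma_{mn}\chi_\rho \qquad (c)$$
Now,
$$\Gamma^{\mu\nu\rho}\Gamma_{mn} = (1/2)[\Gamma^{\mu\nu\rho}, \Gamma_{mn}] + (1/2)\{\Gamma^{\mu\nu\rho}, \Gamma_{mn}\}$$
and
$$[\Gamma_{pqr}, \Gamma_{mn}] = [\Gamma_p\Gamma_{qr} + \Gamma_q\Gamma_{rp} + \Gamma_r\Gamma_{pq},\ \Gamma_{mn}].$$
Now,
$$[\Gamma_p\Gamma_{qr}, \Gamma_{mn}] = \Gamma_p[\Gamma_{qr}, \Gamma_{mn}] + [\Gamma_p, \Gamma_{mn}]\Gamma_{qr}$$
$$= 4\Gamma_p(\eta_{qn}\Gamma_{rm} + \eta_{rm}\Gamma_{qn} - \eta_{qm}\Gamma_{rn} - \eta_{rn}\Gamma_{qm})$$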
a es eer cee fe eee | Deere 
Summing this equation over cyclic permutations of (pqr’) gives us 


[Ppgr; Tmn| = 


AS> tng(U pV nr + Vr'pn + Tal rp) 
(par) 
+40 ting(Upl'rm +22 mp + PmI' pr)
(par) 


=4 ys (mal pnr + Nngl prm) 
(par) 


Note that this quantity is antisymmetric w.r.t interchange of (m,n). It thus 
follows that 
by ae, PinnlXp 


= yer Te [Pears Dinn |X” 
—4 Se pa TOT pra +4 tape Pol nx | 
(par) 


Now 
CPP a” ) * 


seh Pipe eee 
= PPE TO pPrrayr 


which proves the Hermitianity of x?*T°D pn-x” and hence of ¥,,[["”?, Pinn| Xp. In 
fact this quantity is identically zero. To see this, we use the Majorana Fermion 
property of y? to write 

DT ae = 


spe €PpnrX” 


and use the fact that 
(Pel yin) Sale 


pnr 
Sel el Sl el pap eT eT pee 
where we have used the identities 
VeSeT Se lr, 
Then from the anticommutativity of the xP *, we get 
PPT ED ie = 
= XPD par) x? = 
— Dall Nara Went cs 
— PET er a = =X FD Dor x 
from which we conclude that 
sO ain =(0 


Now consider 
DG ae Ws Male ied ge 
where {.,.} denotes anticommutator. We have


where 


we have 


which shows that 


XX; Xs 


Rp SEP Las 


DG aww ai bad ered asad 


Cie seal red eee 


= <x TD nl YY Xp = —Xo 


X*=-X 


ie, X is skew Hermitian. Note that we have used the fact that 


(Pr kay = 


el Rags Pd 7 i aR 
i Bey Ria ed Ried RF 


igs sg Seen ed Bet 


since MY,,,T,,F°,T° are Hermitian and [? = J. Thus, by antisymmetrizing 
over (pgr) and over (mn), we get 


This proves that 


(Pot ela = Peal tas 


ST Taal gee 


tXpl"”’? Di Xp 


is a Hermitian operator field. 


Now consider the following local supersymmetry transformation
$$\delta\chi_\mu(x) = D_\mu\varepsilon(x), \quad \delta e^n_\mu = K\bar\varepsilon(x)\Gamma^n\chi_\mu(x)$$
where $\varepsilon(x)$ is an infinitesimal Majorana Fermionic parameter. We can
easily check that $D_\mu\varepsilon$ also satisfies the Majorana Fermion
property. Indeed,


((Dye(x))*)? = Ou(e(x)*)* + (€(2)*) 7 Tinnwpe” 


Now 


(O,e(x))? TeeP? + e(x) PPTs, wo 


Le 


(O,e(x))? TF eF? — e(@) T° ell ae 
since
T 
P,e= lx 


Thus, since P°7 = 1, it follows that 
Ped yl Sl nak) = 
aPr ja Pe rer’ 
This gives 
((Dye(x))*)* = 
(Guecr ier el 
te(x)? TT Pel? 
= (Dele) Pel? 


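proving thereby the Majorana property of $D_\mu\varepsilon(x)$. Now under the
local supersymmetry transformation of $\chi_\mu$, the gravitino Lagrangian
changes by
$$\delta_\chi(\bar\chi_\mu\Gamma^{\mu\nu\rho}D_\nu\chi_\rho) = \delta\bar\chi_\mu\Gamma^{\mu\nu\rho}D_\nu\chi_\rho + \bar\chi_\mu\Gamma^{\mu\nu\rho}D_\nu\delta\chi_\rho$$
$$= \overline{D_\mu\varepsilon(x)}\,\Gamma^{\mu\nu\rho}D_\nu\chi_\rho + \bar\chi_\mu\Gamma^{\mu\nu\rho}D_\nu D_\rho\varepsilon(x).$$
The term in this quantity that is quadratic in $\{\omega_\mu^{mn}\}$ is given by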
Sw fe(ay Te gl el Yrl pa, 


Ay Tere PDs Tee) 


we must first prove that this is Hermitian under the assumption that e(a) and 
Xu(“) are mutually anticommuting Majorana Fermionic fields. Note that we 
can also express this as 


wires fees TOT ly. Cr) 


+xv(x)*P TOT!’ Tinnl rs€(x)| 


Now, 
Pe? SSP ye 
Thus, 
end Bass Drs ee 
Tae eel as 
Thus, 


(a) Pee loree ry) = 
(eG) Taek eke) =
er Legler Lael x) 
= xP Pll mne(2) 
= eG al PPO aye oe 
= e(@) ia (De) EE Xe 
Sea Peal nk nex 
= (EQ) PT gl oles 
= —e(x)*TT*, POT! ns Xv 


This proves that (e(x)*7T*,, P°T“”?T.sv) is skew-Hermitian. Note that if we 
replace €(x) by ie(a) where (x) is a Majorana Fermion, then the above quantity 
becomes Hermitian. Consider now the second term. It is 


Teer a eee) 
Sa Orel yn ese a) 
= xp DL’? Pinn|Prs€(2) 
+2 TD ant MPT yp g€(2) 
This is also easily shown to be skew-Hermitian. Indeed, its adjoint is given by 
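The general theory of Chiral superfields: The general superfield has the form
$$S(x,\theta) = C(x) + \theta^T\epsilon\,\omega(x) + \theta^T\epsilon\theta\,M(x) + \theta^T\gamma^5\epsilon\theta\,N(x) + \theta^T\epsilon\gamma^\mu\theta\,V_\mu(x)$$
$$+ \theta^T\epsilon\theta\,\theta^T\gamma^5\epsilon(\lambda(x) + a\gamma^\mu\omega_{,\mu}(x)) + (\theta^T\epsilon\theta)^2(D(x) + b\,\Box C(x)).$$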
Under the above mentioned supersymmetry transformation
$$\alpha^T\gamma^5\epsilon\,L = \bar\alpha L$$
where
$$L = \gamma^\mu\theta\,\partial_\mu + \gamma^5\epsilon\,\partial_\theta$$
the change in the superfield $S$ is given by
$$\delta S = \bar\alpha L S.$$
Here $\alpha$ is a Majorana Fermionic parameter. One can compute the change in
the component fields $C, \omega, M, N, V_\mu, \lambda, D$ under this
infinitesimal supersymmetry transformation and show that $D$ changes by a
perfect space-time four divergence and hence can be used as a candidate
Lagrangian. However, in the case when the superfield is such that
$\lambda = 0, D = 0, V_\mu = B_{,\mu}$, the resulting superfield is called
Chiral and it is easy to prove that under a supersymmetry transformation, a
Chiral superfield transforms into a Chiral superfield. The general Chiral
superfield can be expressed as


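$$S_c(x,\theta) = C(x) + \theta^T\epsilon\,\omega(x) + \theta^T\epsilon\theta\,M(x) + \theta^T\gamma^5\epsilon\theta\,N(x) + \theta^T\epsilon\gamma^\mu\theta\,B_{,\mu}(x)$$
$$+ a\,\theta^T\epsilon\theta\,\theta^T\gamma^5\epsilon\gamma^\mu\omega_{,\mu}(x) + b(\theta^T\epsilon\theta)^2\,\Box C(x) \qquad (1)$$
Let us first prove that the class of Chiral superfields is supersymmetry
invariant. To do so, we first note that for the superfield $S$ above, the
change in $\lambda$ under the above infinitesimal supersymmetry transformation,
obtained by equating the cubic terms in $\theta$, is given by
$$\theta^T\epsilon\theta\,\theta^T\gamma^5\epsilon(\delta\lambda + a\gamma^\nu\delta\omega_{,\nu})$$
$$= \bar\alpha\gamma^\mu\theta(\theta^T\epsilon\theta\,M_{,\mu} + \theta^T\gamma^5\epsilon\theta\,N_{,\mu} + \theta^T\epsilon\gamma^\nu\theta\,B_{,\mu\nu})$$
$$+ \bar\alpha\gamma^5\epsilon(4b\,\theta^T\epsilon\theta\,\epsilon\theta)\,\Box C.$$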


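The change in $\omega$ is given by equating linear terms in $\theta$:
$$\theta^T\epsilon\,\delta\omega = \bar\alpha\gamma^\mu\theta\,C_{,\mu} + \bar\alpha\gamma^5\epsilon(2\epsilon\theta\,M + 2\gamma^5\epsilon\theta\,N + 2\epsilon\gamma^\mu\theta\,B_{,\mu}).$$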
By using the identities 
ey ,e,7"e 
are skewsymmetric, 
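$$\theta\theta^T = (1/4)(\theta^T\epsilon\theta\,\epsilon + \theta^T\gamma^5\epsilon\theta\,\gamma^5\epsilon + \theta^T\epsilon\gamma^\mu\theta\,\epsilon\gamma_\mu) \qquad (a)$$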
and 
Ga we ae 
it is easy to see from the above relations that 
6A =0 


Likewise we can verify that $\delta D = 0$ and the condition that
$V_\mu = B_{,\mu}$, i.e., $V_\mu$ is a perfect four gradient, also remains
invariant under a supersymmetry transformation. Indeed,


(07 €0)?(5D + b.06C) 
= ay"0(07 €0).07 ye(a.y"W pv) 


and 
6C = ay €0907 ew 


— al ew 


together imply that 
d6D=0 




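Note that the product of any two distinct members of the six quantities
$$\theta^T\epsilon\theta, \quad \theta^T\gamma^5\epsilon\theta, \quad \theta^T\epsilon\gamma^\mu\theta$$
is zero and therefore
$$\theta(\theta^T\epsilon\theta)\theta^T = (\theta^T\epsilon\theta)^2\epsilon/4.$$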
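The vanishing of products of distinct quadratic forms can be confirmed by direct computation in a four-generator Grassmann algebra. The sketch below is illustrative only. It assumes a chiral basis in which $\theta_L = (\theta_0, \theta_1)$ and $\theta_R = (\theta_2, \theta_3)$, together with the explicit matrices $\gamma^5 = \mathrm{diag}[I,-I]$, $\epsilon = \mathrm{diag}[e,e]$, $\gamma^0 = -i\begin{pmatrix}0&I\\I&0\end{pmatrix}$, $\gamma^k = -i\begin{pmatrix}0&\sigma_k\\-\sigma_k&0\end{pmatrix}$, none of which is fixed by the text. Elements are stored as dictionaries from sorted index tuples to coefficients, and all function names are hypothetical.

```python
def gmul(f, g):
    """Product of Grassmann elements stored as {sorted index tuple: coefficient}."""
    out = {}
    for k1, c1 in f.items():
        for k2, c2 in g.items():
            if set(k1) & set(k2):
                continue  # a repeated generator squares to zero
            # sign picked up when sorting the concatenated word k1 + k2
            inversions = sum(1 for i in k1 for j in k2 if i > j)
            key = tuple(sorted(k1 + k2))
            out[key] = out.get(key, 0) + (-1) ** inversions * c1 * c2
    return {k: c for k, c in out.items() if abs(c) > 1e-12}


def bilinear(A):
    """theta^T A theta = sum over a, b of A[a][b] theta_a theta_b."""
    out = {}
    for a in range(4):
        for b in range(4):
            if a == b or A[a][b] == 0:
                continue
            key, sign = ((a, b), 1) if a < b else ((b, a), -1)
            out[key] = out.get(key, 0) + sign * A[a][b]
    return {k: c for k, c in out.items() if abs(c) > 1e-12}


def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def scale(c, A):
    return [[c * x for x in r] for r in A]


def block(A, B, C, D):
    return [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
E2 = [[0, 1], [-1, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Assumed chiral representation (not fixed by the text).
G5 = block(I2, Z2, Z2, scale(-1, I2))
EPS = block(E2, Z2, Z2, E2)
GAMMAS = [scale(-1j, block(Z2, I2, I2, Z2))] + [
    scale(-1j, block(Z2, s, scale(-1, s), Z2)) for s in SIGMA
]

# The six quadratic forms: eps, gamma5*eps, and eps*gamma^mu for mu = 0..3.
QUADS = [bilinear(EPS), bilinear(mm(G5, EPS))] + [bilinear(mm(EPS, g)) for g in GAMMAS]

distinct_products_vanish = all(
    gmul(QUADS[i], QUADS[j]) == {} for i in range(6) for j in range(i + 1, 6)
)
square_of_first = gmul(QUADS[0], QUADS[0])   # (theta^T eps theta)^2, top-degree term

# Any product of three left-handed components vanishes (only two generators).
theta = [{(a,): 1} for a in range(4)]
three_left_vanish = all(
    gmul(gmul(theta[a], theta[b]), theta[c]) == {}
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
)
print(distinct_products_vanish, three_left_vanish, square_of_first)
```

The square of $\theta^T\epsilon\theta$ survives as a multiple of the top-degree monomial, which is why only that term remains when identity (a) is multiplied by $\theta^T\epsilon\theta$.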


We also use
$$\gamma^\mu\gamma^\nu\omega_{,\mu\nu} = \eta^{\mu\nu}\omega_{,\mu\nu} = \Box\omega.$$

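Finally, we must verify that $V_\mu = B_{,\mu}$ changes by an exact four
gradient under a supersymmetry transformation. To see this we use the equations
corresponding to quadratic terms in the $\theta$:
$$\theta^T\epsilon\theta\,\delta M + \theta^T\gamma^5\epsilon\theta\,\delta N + \theta^T\epsilon\gamma^\mu\theta\,\delta V_\mu =$$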


G7" 007 ew, + 7 09 (07 €0.07 77 €(a.y"w 1) 
= at ey" O0" ew, y+ 
a™ (2007 + 67 67°) (ay"wy) 


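Noting that the terms $\theta^T\epsilon\theta$, $\theta^T\gamma^5\epsilon\theta$,
$\theta^T\epsilon\gamma^\mu\theta$ are all the six linearly independent
quadratic combinations of the $\theta$, we get on equating the coefficients of
$\theta^T\epsilon\gamma^\mu\theta$ on both sides, after recalling identity (a),
that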


dV, =a yey’ (1/Aey wy 
+(a/2)a7 eeqpy’wr 


which indeed proves that $\delta V_\mu$ is a perfect four gradient, thereby
completing the proof that the class of Chiral fields is closed under
supersymmetry transformations.

Remark: Matter fields in non-Abelian quantum field theory get generalized in
supersymmetry theory to super matter fields which are obtained from the
D-component of products of left Chiral fields with their complex conjugates,
while gauge fields in quantum field theory get generalized to super-gauge
fields which are derived from the gauge fields $V_\mu$, the gaugino fields
$\lambda$ and the auxiliary fields $D$. In conventional non-Abelian quantum
field theory, the matter fields transform according to a representation of the
gauge group with the group element being in general local, i.e., a function of
the space-time coordinates, while the gauge fields transform according to the
adjoint representation of the gauge group plus an additional factor involving
space-time gradients of the representation of the local gauge group elements.
In supersymmetry theory, the matter field Lagrangian derived from the
D-component of quadratic combinations of left Chiral fields and their complex
conjugates comprises the scalar field part, the Dirac spinor field part and an
auxiliary part which is determined in terms of the first two parts by setting
the variational derivative of the corresponding action w.r.t. it to zero,
i.e., it is determined by its field equation, which is a purely algebraic
equation for it. Supersymmetry then predicts that the Dirac field component
Lagrangian contains a mass term that depends on the scalar field.

Thus an arbitrary Chiral superfield has the form (1) which we repeat here for
convenience:
$$S_c(x,\theta) = C(x) + \theta^T\epsilon\,\omega(x) + \theta^T\epsilon\theta\,M(x) + \theta^T\gamma^5\epsilon\theta\,N(x) + \theta^T\epsilon\gamma^\mu\theta\,B_{,\mu}(x)$$
$$+ a\,\theta^T\epsilon\theta\,\theta^T\gamma^5\epsilon\gamma^\mu\omega_{,\mu}(x) + b(\theta^T\epsilon\theta)^2\,\Box C(x) \qquad (1)$$


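Now,
$$\theta^T\epsilon\theta = \theta_R^T\epsilon\theta_R + \theta_L^T\epsilon\theta_L,$$
$$\theta^T\gamma^5\epsilon\theta = \theta_R^T\gamma^5\epsilon\theta_R + \theta_L^T\gamma^5\epsilon\theta_L = -\theta_R^T\epsilon\theta_R + \theta_L^T\epsilon\theta_L$$
since
$$\theta_R^T\epsilon\theta_L = 0, \quad \theta_R^T\gamma^5\epsilon\theta_L = 0$$
and
$$\theta = \theta_R + \theta_L = (1-\gamma^5)\theta/2 + (1+\gamma^5)\theta/2.$$
Also since
$$\theta_R^T\epsilon\gamma^\mu\theta_R = \theta_L^T\epsilon\gamma^\mu\theta_L = 0$$
it follows that
$$\theta^T\epsilon\gamma^\mu\theta = 2\theta_R^T\epsilon\gamma^\mu\theta_L = 2\theta_L^T\epsilon\gamma^\mu\theta_R$$
since $\theta_R$ and $\theta_L$ anticommute and $\epsilon\gamma^\mu$ is
skew-symmetric. Further,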
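$$\theta^T\epsilon\theta.\theta^T\gamma^\mu\epsilon = (\theta_R^T\epsilon\theta_R + \theta_L^T\epsilon\theta_L)(\theta_R^T + \theta_L^T)\gamma^\mu\epsilon = \theta_R^T\epsilon\theta_R.\theta_L^T\gamma^\mu\epsilon + \theta_L^T\epsilon\theta_L.\theta_R^T\gamma^\mu\epsilon$$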
Now observe that 
OpO7, = (1/4)(1 — 7°)907 (1+ 7°) 


= (1/16)(1 — 7°) [07 Be + 07 ycOy%€ + 07 ey" Oey, J(1 + 7°) 
= (1/8)(1— 7 ey + 7° )ORey" Oz 
= (1/4)(1— 7° eq ORey" On 


and likewise, 
61.9% = (1/A)L + Vey One" Or 
Thus, using the fact that 
OR(L — 7°) /2 = OR, Of (1 +. 7°)/2 = Or 


we get 
67 6.907 = 


ORO pe" Oey, /2 
+67 7 ey" Ore VY, /2 
= Oey" OO nyy/2 
ORNL OT Yu /2 
Another way to see this is as follows: 
07 00" = 
(On + 07 )e(O" Be + OT yO? € + OT ey"'0.€y,,)/4 
= (Oz + 07 )e(OneO RE + O07 Ope + OnyORyre + OPP eOrye + 20R€ "OL -€y)/4 


Now, 
On OR a OL Or, 


67° 0, = 04 0, 


and hence 
ORe(O, Ore + OY O17" €) 


= One(O7 Or (1+ 7°)e/2 
= —07(1+7°)0f €6,/2 =0 


and likewise, 
07 €(OneORre + Ony ORY’) 


= -67 0;.0R(1— 7°)/2 =0 
Thus, we get 

6 00" = (1/2) 0B Ohey"Or, — (1/2)02 he y"6r 

Thus, the Chiral superfield (1) can be expressed as 

S.(x, 0) = 
C(x) + (OF w(x) + OF ew(x))+ 
(07 0, + OpeOR)M (ax) 
+(07 0, — 02€0R)N(a) 
+2067", B(x) 
— (4/2) ORC NOL) OL WP ey wv + OR WV EY wy) 

+(1/2)(ORey"Or).(OR€%.81 BIC 


An alternative, more convenient formula for the cubic term is obtained as follows. We've already noted that
$$\theta^T\epsilon\theta.\theta^T = (\theta_R^T\epsilon\theta_R + \theta_L^T\epsilon\theta_L)(\theta_R^T + \theta_L^T) = \theta_R^T\epsilon\theta_R.\theta_L^T + \theta_L^T\epsilon\theta_L.\theta_R^T$$
On the other hand, consider the expression


A= Opey"O1.07 €.W 


We have 
A167 = ((L+7°)/2)007 (1+ 7°)/2 
= ((1+7°)/2)(07 Be + 07 7°67 €)((1 + 7°)/2)(1/4) 
= 67 (1+ 7°)e0((1 +-7°)/8) 
= OF Or (1+7°)e/4 
Thus, 


A= (1/4)67 61.0 ey" (1+ y?)eewy 
= —(1/2)0F 61.0Rey" wy 
= (1/2)07 0, .Opyeyw py 


Likewise, defining 
B= Oey" OL OREW yn 


= OF ey" ORO Rew p 
we get using 
OROR = ((1— 7°) /2)007 ((1 + 7°)/2) 
= ((1L— 7°) /2)(07 Be + 67 7707) ((1 — 7°) /2)(1/4) 
= ((1 = 15)/8)e.67(1 — 75) 
= ((1—7°)/4)e.OheOn 


and hence, 
B=-O7ey"((1—7)/4)w pOROrR 


= (-1/2)0ROr.OF yew yn 
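Thus, the cubic term in the above Chiral field is given by
$$a\,\theta^T\epsilon\theta.\theta^T\gamma^\mu\epsilon\omega_{,\mu} = a[\theta_L^T\epsilon\theta_L.\theta_R^T\gamma^\mu\epsilon\omega_{,\mu} + \theta_R^T\epsilon\theta_R.\theta_L^T\gamma^\mu\epsilon\omega_{,\mu}]$$
$$= 2a(\theta_R^T\epsilon\gamma^\mu\theta_L.\theta_L^T\epsilon\omega_{,\mu} - \theta_R^T\epsilon\gamma^\mu\theta_L.\theta_R^T\epsilon\omega_{,\mu})$$
Combining these two identities, we can express the general Chiral super-field as
$$S_c(x,\theta) = C(x) + (\theta_L^T\epsilon\omega(x) + \theta_R^T\epsilon\omega(x)) + (\theta_L^T\epsilon\theta_L + \theta_R^T\epsilon\theta_R)M(x) + (\theta_L^T\epsilon\theta_L - \theta_R^T\epsilon\theta_R)N(x)$$
$$+ 2\theta_R^T\epsilon\gamma^\mu\theta_L B_\mu(x) + 2a\,\theta_R^T\epsilon\gamma^\mu\theta_L(\theta_L^T\epsilon\omega_{,\mu} - \theta_R^T\epsilon\omega_{,\mu}) + (b/2)(\theta_R^T\epsilon\gamma^\mu\theta_L).(\theta_R^T\epsilon\gamma_\mu\theta_L)\Box C$$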


Note: 
(Opey"Or)-(ORey OL) 


a Oney" 0107 ey OR 


and 
6, 0F = (1/4)(1 + 7°)0.07 (1 + 7°) 


= (1/4) (07 eBe(1 + 9°)? + 07 7° Oy° (1 + y°)?) 
= (1/2)67 (1+ 7°)0.(1 +7? )e 
= OF Or (1+7°)e 

and hence, 

(Opey"OL)-(ORe YL) 

= (Opey"(1 +7? )e-€y9R)-(O7 OL) 

= 4(0R0R).(07 OL) 

since 7° anticommutes with y“ and y7,, = —2. On the other hand, 
(07 0)? = (07 6, + OneOR)” 
= 2(070p).(07 OL) 
Exercise: Show that the general Chiral superfield $S_c(x,\theta)$ is expressible as the sum of a left Chiral superfield and a right Chiral superfield, where a left Chiral field is a function of $\theta_L$ and $x_+^\mu = x^\mu + \theta_R^T\epsilon\gamma^\mu\theta_L$, and conversely a right Chiral super-field is a function of $\theta_R$ and $x_-^\mu = x^\mu - \theta_R^T\epsilon\gamma^\mu\theta_L$. Specifically, any left Chiral super-field can be expressed as
$$\Phi(x,\theta) = \phi(x_+) + \theta_L^T\epsilon\psi(x_+) + \theta_L^T\epsilon\theta_L F(x_+)$$
where $\psi$ is a left Chiral field, and any right Chiral field can be expressed as
$$\eta(x,\theta) = \phi(x_-) + \theta_R^T\epsilon\psi(x_-) + \theta_R^T\epsilon\theta_R F(x_-)$$
where $\psi$ is right Chiral. Note that there is no loss of generality in assuming that $\psi$ is left Chiral in the former case and right Chiral in the latter since if $\psi$ is arbitrary, then
$$\theta_L^T\epsilon\psi = \theta_L^T\epsilon((1+\gamma^5)/2)\psi,$$
$$\theta_R^T\epsilon\psi = \theta_R^T\epsilon((1-\gamma^5)/2)\psi$$
since $\gamma^5$ commutes with $\epsilon$. Note that
0, = eV Or, 
0% = yey OL 
It is easy to verify that 
ey 


is a real matrix whose square is the identity and this confirms the requirement 
that 


(97)" = 61, OR)" = 9r 
Note that by the definition of the Majorana Fermion, 
6* = yey 
which gives 
(6, +Or)* = 0% + 0% = yey? (Or + OR) 


from which the desired relationships follow on equating the first two components 
and the last two components. Now, 


(Oney"O1)* 
= 0%)" conj(ey")07, 
= (7°€7"O1)* conj(ey") yey Or 
= OF yey .conj(ey") 77 OR 
= 8 PeP(ey")* yey" On 
since ey“ is skew-symmetric. Now since yey? is Hermitian and y“ and 7° 
anticommute (note that 7° and € commute), it follows that 


Pep (eq)*yPey? 
= (Peyeytyrey)* 

(eyPeytey)* 

Now, 
eyo = 7° 

and further, 

ey?)* = —7%e 

(ey Y 


and 7°7 is Hermitian. Thus, the above equals 
Ck eas 


= Peyyt = ey 
Thus, we have proved that 
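$$(\theta_R^T\epsilon\gamma^\mu\theta_L)^* = -\theta_L^T\epsilon\gamma^\mu\theta_R = -\theta_R^T\epsilon\gamma^\mu\theta_L$$
It follows that
$$(x_+^\mu)^* = x_-^\mu$$
and since
$$\theta_L^* = \gamma^5\epsilon\gamma^0\theta_R$$
with $\gamma^5\epsilon\gamma^0$ being non-singular, the conjugate of any function of $(\theta_L, x_+^\mu)$ is a function of $(\theta_R, x_-^\mu)$ and conversely. Therefore, the conjugate of any left Chiral superfield is a right Chiral superfield and conversely.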

Construction of supersymmetric Lagrangians (actions) from Chiral superfields. We shall observe that when we construct the supersymmetric Lagrangian as $[\Phi^*\Phi]_D$, we get terms corresponding to the kinetic parts of the Klein-Gordon and Dirac Lagrangians, while when we construct the supersymmetric Lagrangian as $[f(\Phi)]_F$, we get extra inertial parts for these Lagrangians. In particular, we will observe the remarkable fact that after constructing the total Lagrangian by adding these two components and writing down the field equations for the auxiliary fields $D, F$, we are able to eliminate these fields and obtain a broken supersymmetric Lagrangian. The solution to the auxiliary field equations determines the masses of the Dirac particle. In this way, supersymmetry is able to explain in a natural way how the scalar field couples to the Dirac field, giving rise to massive Dirac particles after supersymmetry is broken.

Consider now the Lagrangian
$$L_1 = [\Phi^*\Phi]_D$$
where
$$\Phi = \phi(x_+) + \theta_L^T\epsilon\psi(x_+) + \theta_L^T\epsilon\theta_L F(x_+)$$


Then, 
G* = G*(x_)+ 0b yey ew" (ax_)+ On eyeyrey ORF" (x) 
= $*(v-)+ ORY Y* (w+ OReORF” (x) 
We shall now evaluate all the terms in [®*®]p: First, 
[o* (w_)O(@4)]p = Ti + Th + Ts 


where 
T, = —6%,6,r(OpeV"61)-(OREY OLD 


= cb", (2) b,r(£) = c10.6" (x).0"9(z) 
Th = (1) 0 pr (x) [Opey" Or -Oney’ OL] 
= 19" (x)O¢(x) 


where 


= 1f'”I,0) = OO, 


and likewise, 


T3 = c1¢(x)O¢* (2) 


Remark: 
One" 0, Oney’ OL 
= Opey" O07 ey" OR 
A167, = ((L+7°)/2)007 (1+ 7°)/2) 
= be(1+7°)((1 +-7°)/2)e 
= OF Or(1+7°)e 
Thus, 


One" O, Oey’ OL = 
(07 0, )Opey“e(1 + 7° )eey’Or 
= —2(07 0.) One ey" OR 
= ¢(67 0)? nh” 


which follows on expressing 7" in terms of the Pauli spin matrices and using the 
fact that 0; has components 01.2 while 0g has components 63.4 and the product 
of any four of the 6’s is non-zero iff all the 6’s are distinct. Again, 


[ORy ob" (w_)OF ed (x4)]D 
= [w*,(x)" Oey" Ory OROL eb (2) |p 
+[wp (a)? Of ey"O1€6, OR 7°" (2) |b 


Now, 
OnOt = Obey" Orey,(1 + 7°) 
So 
[w*,(2)" Opey"Or7° OR 97 ew(x)|D 


= 2[02 ey" 61 ORE Or] v-b*, (a)? Peqew(x) 
= —2ernt’y*,(2) oy7 V2) 
= 2c nh, (2)y~ (2) 
= —2eyntr of" (x)? 7 Pw(z) 
= —2e1(7"'Y,,(2))*7°v(2) 


where we have used the fact that y,7y° is Hermitian and hence its transpose 
cies is also Hermitian. Note that the conjugate of this quantity is given by 


—2ep(2)* yd y (2) = —2crh(x) yy (2) 
= —2c1y(xz)* al (x) 


Supercurrent: 
Feynman superpath integrals and superpropagators.
The action functional for the left Chiral superfield $\Phi$ is taken as


S[®] = [(@*Poete + [U@led's 
Equivalently, writing ® = D?,S we can express this as 
S= : (D?.S*)(D%,8)d*ad*6 + i Sd‘ad‘0 
Note that 7 
f(DpS) = Dp 
for some other superfield S. For example, 
(DRS)° = Da(SDRS), (DRS)* = Da(S(DRS)”) 


etc. So in fact writing 


we get 
f (DRS) = Dk c(k)S(D2S)*4) 


k>1 
So formally, we can write 
f(DpS) = DR (S.(DRS)* f(DR8)) 
The superfield equations are expressible as 
DpDiS* = f'(®) = f' (DRS) 
or equivalently, 
Di, DzS = fx (Di S*) 


since 
f' (DR S)DR5S = Dp f"(DRS)5S) 


and hence 


if Lf’ (D?,8) D268] pd4z = / f' (D?,8)6Sd*ad*0 


By analogy with quantum field theory, it is therefore natural to consider a super-Green's function $G(x,\theta|x',\theta')$ that satisfies the propagator equation
$$D_L^2D_R^2G(x,\theta|x',\theta') - f'^*(D_L^2G(x,\theta|x',\theta')^*) = \delta^4(x-x')\delta^4(\theta-\theta')$$
Note that by the definition of the Berezin/Fermionic integral,
$$\int\theta_{i_1}...\theta_{i_k}d^4\theta$$
is zero if either $k < 4$ or $k > 4$, while
$$\int\theta_1\theta_2\theta_3\theta_4d^4\theta = 1$$
It follows that
$$\delta^4(\theta-\theta') = (\theta_1-\theta_1')(\theta_2-\theta_2')(\theta_3-\theta_3')(\theta_4-\theta_4')$$
This may be explicitly checked by writing out $f(\theta)$ as
$$f(\theta) = c_0 + c_1(k)\theta_k + c_2(k,m)\theta_k\theta_m + c_3(k,m,n)\theta_k\theta_m\theta_n + c_4\theta_1\theta_2\theta_3\theta_4$$
and applying the above Berezin rules, taking into account the anticommutativity of the $\theta$ and the $\theta'$, to show that
$$\int f(\theta)\delta^4(\theta-\theta')d^4\theta = f(\theta')$$
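
The check described above can also be carried out mechanically. The following is a minimal sketch, not part of the original text: it assumes a toy representation of Grassmann polynomials as Python dictionaries (the helper names `mul`, `berezin`, `delta4` and `random_f` are hypothetical), with generators 0..3 standing for $\theta_1,...,\theta_4$ and generators 4..7 for $\theta_1',...,\theta_4'$, and the normalisation $\int\theta_1\theta_2\theta_3\theta_4d^4\theta = 1$.

```python
import random

N_THETA = 4  # generators 0..3 stand for theta_1..theta_4, generators 4..7 for theta'_1..theta'_4


def mul_mono(a, b):
    """Product of two monomials given as sorted index tuples: returns (sign, sorted tuple)."""
    if set(a) & set(b):
        return 0, ()  # a repeated Grassmann generator squares to zero
    seq = a + b
    # each inversion costs one transposition, i.e. one factor of -1
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return (-1) ** inversions, tuple(sorted(seq))


def mul(x, y):
    """Product of two Grassmann polynomials stored as {monomial: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            sign, mono = mul_mono(a, b)
            if sign:
                out[mono] = out.get(mono, 0) + sign * ca * cb
    return {m: c for m, c in out.items() if c != 0}


def berezin(x):
    """Integrate over theta_1..theta_4 with int theta1 theta2 theta3 theta4 d^4theta = 1."""
    top = tuple(range(N_THETA))
    # sorted monomials list the theta's first, so a full theta block is a prefix
    return {m[N_THETA:]: c for m, c in x.items() if m[:N_THETA] == top}


def delta4():
    """delta^4(theta - theta') = prod_i (theta_i - theta'_i)."""
    d = {(): 1}
    for i in range(N_THETA):
        d = mul(d, {(i,): 1, (i + N_THETA,): -1})
    return d


def random_f(rng):
    """Random polynomial f(theta) with a nonzero integer coefficient on every monomial."""
    f = {}
    for mask in range(1 << N_THETA):
        mono = tuple(i for i in range(N_THETA) if (mask >> i) & 1)
        f[mono] = rng.randint(1, 9)
    return f


rng = random.Random(0)
f = random_f(rng)
# f evaluated at theta': shift every generator index by N_THETA
f_at_theta_prime = {tuple(i + N_THETA for i in m): c for m, c in f.items()}
lhs = berezin(mul(f, delta4()))
print(lhs == f_at_theta_prime)
```

The comparison holds for every $f$ because $\theta_i\delta^4(\theta-\theta') = \theta_i'\delta^4(\theta-\theta')$, so only the top component $\theta_1\theta_2\theta_3\theta_4$ of the delta function survives the integration.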


In the absence of a superpotential $f(\Phi)$, the field equations are
$$D_L^2D_R^2S = 0$$
This equation should be regarded as the super-version of the classical massless Klein-Gordon equation or equivalently the wave equation. The corresponding super-propagator $G$ should satisfy the super pde
$$D_L^2D_R^2G = P\delta^4(x-x')\delta^4(\theta-\theta')$$
where $P$ is the projection onto the space of superfields that belong to the orthogonal complement of the nullspace of $D_L^2D_R^2$, or equivalently that belong to the range space of $D_L^2D_R^2$. This is precisely in analogy with quantum electrodynamics. It is easy to see that
$$P = K\Box^{-1}D_L^2D_R^2$$
for some real constant $K$. In fact, we have that $P$ annihilates any vector in the range of $D_R$ and hence in the range of $D_R^2$, and further, we have
$$P^2 = K^2\Box^{-2}D_L^2D_R^2D_L^2D_R^2 = K^2\Box^{-2}D_L^2[D_R^2, D_L^2]D_R^2$$


with 
D2D pg =DieDe Drs 


{Dra Dio} = {(y"OLOn — Y€O0n as (Y” ORO, — 7°€8o, )o} 
—(7" )ac{Ore Or} (17 )bdO 
—(7 acon. Ra} Ya 
= —[y"((1 + °)/2) (97)? — Pe((1 — °)/2) 97 Jab Ou 
=f" (+ )et+ (1-97) 7" avd, 
= [ye(1+7)]adO. = Xab 


say. Interchanging a and b gives 
{Dra, Dro} = (1+ Vey” avn 


—(y"e(1L — 7°) abn = Xba 


say. In matrix notation, these identities are expressible as 
{Dr, Di} = yWe(1 + 7°)u, 


{Dz, Da} = —y*e(1 — 7°) Oy 


Adding these two equations and noting that Dr anticommutes with itself and 
Dr also anticommutes with itself, we get 


{D, D*} = 24", 
Then, 
DeDra = cD RoDReDia = 
€beDro(Xea — DraD Re) 
= €peXcaD rv — €oeDrvoDiaD Re 
= €cXcaD Rp — €be(Xba — DiaD ro) Dre 
= €peXcaD Rp — €veXbaD Re 
+DiaD} 


Equivalently, 
[Des Dral = €beXcaD Rd a €bcXbaDV Re 


Then, 
DEDiaDiz = ee. eae, Gr _ DpDro) 


—€pc-Xba(Xep =. DrpD Re) + Dat |Dey Drpl oF Drs De) 


so that 
DEDEDE = Gp DaDiaDigDa = 


= (epsilonicX ca€ap Ave — €oeX ba€apX ep) D> 


since the product of any three D/ps is zero. We can express this relationship as 


(DpDi) = 
ITE Nex \De 


and hence, 
(De Dee = oT ren ex DD, 
Now, 
X= ye(1 +7), 
and hence, 
Tr(e.X.e.X7) = 
—Tr(eqyte(1 +7 )2e2y"" 0,0, 

But, 


arlene rey) 
2Tr(ey".e(1+7°)y"’") 
= —2Tr(ey"7"(1— 7° )e) 
= 2.Tr(y¥y" (1 — 7) = en” 


where $c$ is a real constant. Thus,
$$(D_L^2D_R^2)^2 = c\,\Box\,D_L^2D_R^2$$
from which it follows that $P = c^{-1}D_L^2D_R^2/\Box$ is idempotent, i.e., a projection.


Supersymmetric gauge theories: The gauge 

Design of quantum unitary gates using supersymmetric field theories: 

Given a Lagrangian for a set of Chiral superfields and gauge superfields, we can construct the action as an integral of the Lagrangian over space-time. We can include forcing terms in this Lagrangian, for example by adding c-number control gauge potentials to the quantum gauge field $V_\mu(x)$, or c-number control current terms to the terms involving the Dirac current which couples to the gauge field. After adding these c-number control terms, the resulting action will no longer be supersymmetric. However, we can still construct the Feynman path integral for the resulting action between two states of the field, i.e., by specifying the fields at the two endpoints of a time interval $[0,T]$, and we then obtain a transition matrix between these two states of the field. For example, the initial state can be a coherent state in which the annihilation component of the electromagnetic vector potential has definite values and the Dirac field of electrons and positrons is in a Fermionic coherent state where the annihilation component of the wave function has definite values. Likewise with the final state. Alternatively, we may specify the initial state to be a state in which there are definite numbers of photons, electrons and positrons having definite four momenta and spins, and similarly for the final state. In the case of a supersymmetric theory, we also have to specify the states of the other fields like the gaugino field, the gravitino field and the auxiliary fields. Otherwise, we may break the supersymmetry by expressing the auxiliary fields in terms of the other superfield components using the variational equations of motion, then calculate the Feynman path integral corresponding to an initial and a final state, and finally make this transition matrix as close as possible to a desired transition matrix by optimizing over the c-number control fields.


[21] One of the main achievements in the work of C.R.Rao was the proof of the lower bound on the error covariance matrix of a statistical estimator of a vector valued parameter based on vector valued observations, using techniques of matrix theory. This bound is the Cramer-Rao lower bound (CRLB). C.R.Rao has also considered the case when the Fisher information matrix is singular, and in this case he has been able to use the methods of generalized inverses to obtain new formulas for the lower bound. The lower bound on the variance of an estimator should be compared to the Heisenberg uncertainty principle in quantum mechanics for two non-commuting observables. In fact, it can be shown that the Heisenberg uncertainty inequality for position and momentum can be derived using the CRLB. The CRLB roughly tells us that no matter how hard we try, we can never achieve complete accuracy in our estimation process, i.e., there is inherently some amount of uncertainty about the system that generates a random observation.


Chapter 3 


Some Study Projects on 
Applied Signal Processing 
with Remarks About Related 
Contributions of Scientists 


[1] Linear models: Time series models like AR, MA, ARMA, and casting these models in the form
$$X(n) = H(n)\theta + V(n)$$
where $X(n), H(n)$ are data vectors and data matrices and $V(n)$ is noise, with $H(n) \in \mathbb{R}^{N\times p}$ and $X(n), V(n) \in \mathbb{R}^N$. If $R_v = Cov(V(n))$ and the $V(n)$ are iid zero mean Gaussian, then the MLE of $\theta$ based on data collected up to time $n$ is given by
$$\hat\theta(n) = \Big(\sum_{k=1}^nH(k)^TR_v^{-1}H(k)\Big)^{-1}\Big(\sum_{k=1}^nH(k)^TR_v^{-1}X(k)\Big)$$
Since $X(n), H(n)$ are data matrices, they are also random and we wish to determine the mean and covariance of $\hat\theta(n)$ in terms of the statistics of these data matrices.
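
As an illustration of the estimator above, here is a minimal NumPy sketch. The function name `mle_theta`, the dimensions and the synthetic Gaussian data are assumptions made for the example only; the formula itself is the weighted least squares expression for $\hat\theta(n)$ given in the text.

```python
import numpy as np


def mle_theta(H_list, X_list, Rv):
    """theta_hat = (sum_k H(k)^T Rv^-1 H(k))^-1 (sum_k H(k)^T Rv^-1 X(k))."""
    Rv_inv = np.linalg.inv(Rv)
    A = sum(H.T @ Rv_inv @ H for H in H_list)
    b = sum(H.T @ Rv_inv @ X for H, X in zip(H_list, X_list))
    return np.linalg.solve(A, b)  # solve instead of forming the inverse explicitly


# Synthetic data: N = 4 observations per time step, p = 3 parameters (illustrative values)
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])
Rv = np.diag([0.1, 0.2, 0.3, 0.4])
chol = np.linalg.cholesky(Rv)  # colours white noise so that Cov(V) = Rv
H_list = [rng.standard_normal((4, 3)) for _ in range(500)]
X_list = [H @ theta_true + chol @ rng.standard_normal(4) for H in H_list]

theta_hat = mle_theta(H_list, X_list, Rv)
print(theta_hat)
```

With a fixed (non-random) sequence $H(k)$, the covariance of this estimate is $(\sum_kH(k)^TR_v^{-1}H(k))^{-1}$; the harder question raised in the text is what happens when $H(k)$ is itself built from the data.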


[2] Innovations process and its application to the construction of the Wiener filter. Stochastic processes as curves in Hilbert space. Let $x(n)$ be a stationary process. Its innovations process $e(n)$ can be expressed as
$$e(n) = P^\perp_{span(x(k):k<n)}x(n) = x(n) - P_{span(x(k):k<n)}x(n)$$
This means that
$$e(n) = \sum_{k\ge0}l(k)x(n-k), \quad l(0) = 1$$
and by inversion,
$$x(n) = \sum_{k\ge0}h(k)e(n-k)$$
so we can write
$$L(z) = \sum_{k\ge0}l(k)z^{-k}, \quad H(z) = L(z)^{-1} = \sum_{k\ge0}h(k)z^{-k}, \quad x(n) = \sum_{k\ge0}h(k)e(n-k), \quad h(0) = 1$$
$e(n)$ is clearly a white process. Let
$$E(e(n)^2) = \sigma_e^2$$
Then, the power spectral density of $x(n)$ is given by
$$S_{xx}(z) = \sigma_e^2H(z)H(z^{-1})$$
To get a causal, stable $H(z)$ with an inverse that is also causal and stable, we assume that $S_{xx}(z)$ is rational and then select $H(z)$ so that it is causal, stable and minimum phase, i.e., its poles and zeroes all fall within the unit circle. Then $L(z) = H(z)^{-1}$ is also causal, stable and minimum phase. The prediction error energy is
$$E(e(n)^2) = \sigma_e^2$$
and if $\Gamma$ denotes the unit circle,
$$(2\pi j)^{-1}\oint_\Gamma\ln(S_{xx}(z))z^{-1}dz = \ln\sigma_e^2$$
This can be easily verified using the residue theorem and the fact that $H(z)$ has all its poles and zeroes inside the unit circle.
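
The contour integral identity can be checked numerically. The sketch below is an illustration only and assumes a first order AR model $H(z) = 1/(1 - az^{-1})$ with $|a| < 1$, which is causal, stable and minimum phase; the function name `mean_log_psd` is hypothetical. On the unit circle $z = e^{j\omega}$ the contour integral reduces to the average of $\ln S_{xx}(e^{j\omega})$ over $\omega$.

```python
import numpy as np


def mean_log_psd(a, sigma2, n_grid=4096):
    """Average of ln S_xx(e^{jw}) over a uniform grid on [0, 2*pi) for an AR(1) model."""
    w = 2 * np.pi * np.arange(n_grid) / n_grid
    H = 1.0 / (1.0 - a * np.exp(-1j * w))   # H(z) = 1/(1 - a z^-1) evaluated at z = e^{jw}
    S = sigma2 * np.abs(H) ** 2             # S_xx = sigma_e^2 H(z) H(z^-1) on the unit circle
    return np.mean(np.log(S))


sigma2 = 0.7
value = mean_log_psd(a=0.6, sigma2=sigma2)
print(value, np.log(sigma2))
```

The two printed numbers agree to machine precision, which is the statement that the prediction error energy is $\exp$ of the mean log spectrum.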


Some Study Project problems

[1] Compute the capacitance between two parallel cylinders having different radii.

[2] State and prove Doob's maximal inequality for submartingales.

[3] State and prove the Martingale down-crossing inequality and its application to proving the martingale convergence theorem.

[4] State and prove Doob's $L^p$-inequality for martingales.

[5] Power spectrum estimation: Compute the mean and covariance of the periodogram of a stationary Gaussian process.

[6] Apply Doob's $L^p$-martingale inequality to prove the almost sure existence and uniqueness of the solutions to Ito's stochastic differential equation when the drift and diffusion coefficients satisfy the Lipschitz conditions.

[7] When the Choi-Kraus-Stinespring operators of a quantum noisy channel have classical randomness, how does one determine the mean square state estimation error of the output of the recovery channel when the recovery operators have been designed in accordance with the Knill-Laflamme theorem for the mean value of the noisy channel operators in the Choi-Kraus-Stinespring representation?

[8] Derive the nonlinear filtering equations for a Markov state when the measurement noise is a mixture of a white Gaussian component and a compound Poisson component.

[9] In quantum scattering theory, when the free particle Hamiltonian is $H_0$ and the scattering potential is $V$, where $V$ is a Gaussian random Hermitian operator, how does one compute the statistical moments of the scattering operator using the well known formula for the moments of a Gaussian random vector?

[10] Application of Cramer's theorem to computing the optimal rate at which the probability of missing the target tends to zero given that the probability of false alarm tends to zero, with regard to the binary hypothesis testing problem for a sequence of iid random variables, i.e., under $H_1$ the data $(X_1,...,X_n)$ has pdf $p_1(x_1)...p_1(x_n)$ and under $H_0$ it has pdf $p_0(x_1)...p_0(x_n)$. The Neyman-Pearson test for $n$ iid data samples is as follows: select $H_1$ if
$$\frac{p_1(X_1)...p_1(X_n)}{p_0(X_1)...p_0(X_n)} > \lambda_n$$
and select $H_0$ otherwise, where $\lambda_n$ is chosen so that $Pr(H_1|H_0) = P_F(n) \to 0$ as $n \to \infty$. We prove that under this constraint on $P_F(n)$, the minimum possible value of $\lim_nn^{-1}\log(P_M(n))$ is $-D(p_0|p_1)$, where $P_M(n) = Pr(H_0|H_1)$ and
$$D(p|q) = \int p(x)\ln(p(x)/q(x))dx$$
The proof of this result is based on defining the sequence
$$\xi(n) = \log(p_1(X_n)/p_0(X_n))$$
and
$$S(n) = \xi(1) + ... + \xi(n)$$
Under any given hypothesis, the $\xi(n)$'s are iid r.v's. The Neyman-Pearson test is: $S(n)/n > \eta(n)$ implies select $H_1$, and select $H_0$ otherwise, where $\eta(n) = n^{-1}\log(\lambda_n)$. The false alarm probability is given by
$$P_F(n) = Pr(S(n)/n > \eta(n)|H_0)$$
and the miss probability is
$$P_M(n) = Pr(S(n)/n < \eta(n)|H_1)$$
These two probabilities are approximately evaluated using Cramer's theorem. We have
$$n^{-1}\log(E(\exp(s.S(n))|H_0)) = \log E(\exp(s.\xi(1))|H_0) = \log\int(p_1(x)/p_0(x))^sp_0(x)dx = \log\int p_1(x)^sp_0(x)^{1-s}dx$$
and
$$n^{-1}\log(E(\exp(s.S(n))|H_1)) = \log\int p_1(x)^{1+s}p_0(x)^{-s}dx$$
Thus, for large $n$,
$$n^{-1}\log(P_F(n)) \approx -\inf_{x>\eta}\sup_s\Big(s.x - \log\int p_1(x)^sp_0(x)^{1-s}dx\Big) = -\sup_{s\ge0}\Big(s.\eta - \log\int p_1(x)^sp_0(x)^{1-s}dx\Big)$$
The minimum that we require is that $P_F(n) \to 0$, and this is guaranteed once the above quantity is negative. In the worst case, we may take it to be a negative number arbitrarily close to zero. Equivalently, in this worst case situation, the supremum above, namely zero, is attained when $s = 0$, and the optimal choice of the threshold $\eta$ is obtained by setting the derivative above w.r.t. $s$ to zero at $s = 0$, i.e.,
$$\eta = \int p_0(x)\log(p_1(x)/p_0(x))dx = -D(p_0|p_1)$$
For this value of the optimal threshold, we compute the optimal rate at which $P_M(n)$ converges to zero, again by applying Cramer's theorem.
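
The threshold condition can be checked on a concrete pair of densities. The sketch below is an illustration under stated assumptions: $p_0 = N(0,1)$ and $p_1 = N(\mu,1)$ with $\mu = 1.5$, integrals replaced by Riemann sums on a fine grid, and hypothetical helper names (`gauss`, `log_mgf_h0`). For this pair the logarithmic moment generating function under $H_0$ is $-s(1-s)\mu^2/2$, so its slope at $s = 0$ should equal $-\mu^2/2 = -D(p_0|p_1)$.

```python
import numpy as np

x = np.linspace(-15.0, 15.0, 300001)
dx = x[1] - x[0]


def gauss(x, mean):
    """Unit variance Gaussian density."""
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2 * np.pi)


mu = 1.5
p0, p1 = gauss(x, 0.0), gauss(x, mu)


def log_mgf_h0(s):
    """log of int p1^s p0^(1-s) dx, the log-MGF of xi = log(p1/p0) under H0."""
    return np.log(np.sum(p1 ** s * p0 ** (1.0 - s)) * dx)


# Relative entropy D(p0|p1) = int p0 ln(p0/p1) dx
kl_p0_p1 = np.sum(p0 * np.log(p0 / p1)) * dx

# Slope of the log-MGF at s = 0 by a central difference: this is the limiting threshold eta
h = 1e-4
eta_star = (log_mgf_h0(h) - log_mgf_h0(-h)) / (2 * h)

print(eta_star, -kl_p0_p1, -mu ** 2 / 2)
```

All three printed values coincide, which matches the threshold $\eta = -D(p_0|p_1)$ obtained from the stationarity condition at $s = 0$.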


[11] Basics of queueing theory:

Let $X_1, X_2, ...$ denote the successive interarrival times of packets in a single server queue and let $T_1, T_2, ...$ denote the service times for packet 1, packet 2, etc. Let $W_n$ denote the total waiting time for the $n^{th}$ packet including the service time, i.e., $W_n$ is the time taken for the $n^{th}$ packet from its arrival time up to the time when its service is completed and it leaves. We then have the obvious recursive relationship
$$W_{n+1} = \max(S_n + W_n - S_{n+1}, 0) + T_{n+1}$$
where
$$S_n = X_1 + ... + X_n$$
is the arrival time of the $n^{th}$ packet. Suppose the $X_n$'s are iid with distribution $F_X$ and the $T_n$'s are iid with distribution $F_T$. Then, the problem is to determine the law of the waiting time process $\{W_n, n = 1, 2, ...\}$. We also wish to determine the distribution of $N(t)$, the number of packets in the queue at time $t$. Since the $n^{th}$ packet leaves at time $S_n + W_n$, the total number of departures that have taken place in the duration $[0,t]$ is given by $D(t) = \max\{n \ge 1: S_n + W_n \le t\}$, and the total number of arrivals that have taken place in the duration $[0,t]$ is given by $A(t) = \max\{n \ge 1: S_n \le t\}$. Then, the size of the queue at time $t$ is given by
$$N(t) = A(t) - D(t)$$
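
The recursion is easy to simulate. The following sketch assumes exponential interarrival and service times (an M/M/1 queue chosen purely for illustration) and a first packet that finds the server idle, so that $W_1 = T_1$; the function names `sojourn_times` and `queue_size` are hypothetical.

```python
import numpy as np


def sojourn_times(X, T):
    """W_{n+1} = max(S_n + W_n - S_{n+1}, 0) + T_{n+1}, with W_1 = T_1 (empty queue at start)."""
    S = np.cumsum(X)            # arrival times S_n = X_1 + ... + X_n
    W = np.empty(len(X))
    W[0] = T[0]
    for n in range(len(X) - 1):
        W[n + 1] = max(S[n] + W[n] - S[n + 1], 0.0) + T[n + 1]
    return S, W


def queue_size(t, S, W):
    """N(t) = A(t) - D(t): arrivals up to t minus departures up to t."""
    arrivals = int(np.sum(S <= t))
    departures = int(np.sum(S + W <= t))   # packet n leaves at time S_n + W_n
    return arrivals - departures


rng = np.random.default_rng(0)
X = rng.exponential(1.0, size=2000)   # mean interarrival time 1.0
T = rng.exponential(0.7, size=2000)   # mean service time 0.7 (load 0.7)
S, W = sojourn_times(X, T)
print(W.mean(), queue_size(S[1000], S, W))
```

For a first-come-first-served server the departure times $S_n + W_n$ are nondecreasing in $n$, which is why counting the departures up to time $t$ agrees with the maximum in the definition of $D(t)$.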


[12] Group representation theory and its application to statistical image processing on a curved manifold.

[1] Definition of the image field model on a manifold $M$ on which a Lie group of transformations $G$ acts.

[2] Estimation of the group transformation element from the measured image field with knowledge of the original noiseless untransformed image field, using the irreducible representations of the group $G$.

[3] By assuming that the estimate of the $G$-transformation element $g$ is a small perturbation of the true transformation, i.e.,
$$g = \exp(\delta.X)g_0, \quad X \in \mathfrak{g}$$
calculate the value of $X$ up to $O(\delta^m)$ and hence determine the probability that the error $X$ is larger than a threshold $\epsilon$ for a given $g_0$.

Some details: The image field model is
$$f(x) = f_0(g^{-1}x) + w(x), \quad x \in M$$
Let $Y_{nl}(x), l = 1, 2, ..., d_n$ define an onb for an irreducible unitary representation $\pi_n$ of $G$ appearing in the decomposition of the representation $U$ in $L^2(M,\mu)$, where $\mu$ is a $G$-invariant measure on $M$ and $U(g)f(x) = f(g^{-1}.x)$. Then
$$Y_{nl}(g^{-1}x) = \sum_{m=1}^{d_n}[\pi_n(g)]_{ml}Y_{nm}(x)$$
with the additional condition,
$$\int_M\bar Y_{nl}(x)Y_{km}(x)d\mu(x) = \delta(n,k).\delta(l,m)$$
and
$$L^2(M,\mu) = Cl(span\{Y_{nl}: 1 \le l \le d_n, n \ge 1\})$$
The image field model is then equivalent to
$$f(n) = \pi_n(g)f_0(n) + w(n), \quad n \ge 1$$
where
$$f(n) = ((\langle Y_{nl}, f\rangle))_{l=1}^{d_n}, \quad \langle Y_{nl}, f\rangle = \int_Mf(x)\bar Y_{nl}(x)d\mu(x)$$
Let $g_0$ be the true value of $g$ and $\exp(\delta.X)g_0$ its estimate. Then, we have
$$\hat X = argmin_{X\in\mathfrak{g}}\sum_n\sigma(n)\parallel f(n) - \pi_n(\exp(\delta.X)g_0)f_0(n)\parallel^2$$
where we are assuming that the noise is Gaussian with zero mean and $G$-invariant correlation, and the $\sigma(n)$ are the corresponding weights.
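
The simplest instance of this scheme is the rotation group $SO(2)$ acting on the circle, where every irreducible representation is one dimensional, $\pi_n(g) = \exp(-jng)$, and the coefficients $f(n)$ are ordinary Fourier coefficients. The sketch below is only an illustration under these assumptions: the template, the noise level, the equal weighting of all $n$ and the grid search over candidate angles are choices made for the example, and `estimate_rotation` is a hypothetical name.

```python
import numpy as np


def estimate_rotation(f, f0, n_candidates=4000):
    """Grid search for g minimising sum_n |f(n) - exp(-j n g) f0(n)|^2 on the circle group."""
    M = len(f)
    n = np.fft.fftfreq(M, d=1.0 / M)        # integer frequencies labelling the irreps of SO(2)
    F, F0 = np.fft.fft(f), np.fft.fft(f0)
    cand = 2 * np.pi * np.arange(n_candidates) / n_candidates
    # pi_n(g) = exp(-j n g): rotating by g multiplies the n-th coefficient by this phase
    pred = np.exp(-1j * np.outer(cand, n)) * F0[None, :]
    cost = np.sum(np.abs(F[None, :] - pred) ** 2, axis=1)
    return cand[np.argmin(cost)]


def template(x):
    """Band-limited noiseless image field on the circle (illustrative choice)."""
    return 1.0 + np.cos(x) + 0.5 * np.sin(2 * x) + 0.3 * np.cos(3 * x)


M = 256
x = 2 * np.pi * np.arange(M) / M
g_true = 1.234
rng = np.random.default_rng(0)
f0 = template(x)
f = template(x - g_true) + 0.05 * rng.standard_normal(M)   # f(x) = f0(g^-1 x) + w(x)

g_hat = estimate_rotation(f, f0)
print(g_hat, g_true)
```

For a non-Abelian group such as $SO(3)$ the phases are replaced by the matrices $\pi_n(g)$ and the one dimensional search by an optimisation over the Lie algebra element $X$, as described in the text.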


[13] Statistical theory of fluid turbulence: Partial differential equations for the velocity field moments. From homogeneity and isotropy,
$$R_{ij}(r_1,r_2) = \langle v_i(r_1)v_j(r_2)\rangle = A(r)n_in_j + B(r)\delta_{ij}$$
where $r = |r_1 - r_2|$ and $\hat n = (r_2 - r_1)/|r_2 - r_1|$. Also
$$C_{ijk}(r_1,r_2,r_3) = \langle v_i(r_1)v_j(r_2)v_k(r_3)\rangle = C_{ijk}(r_2 - r_1, r_3 - r_1)$$
This third rank tensor must be constructed using scalar functions of $|r_2 - r_1|, |r_3 - r_1|$ and the unit vectors along the directions $r_2 - r_1$ and $r_3 - r_1$, and it must be symmetric w.r.t. the interchange of $(r_2 - r_1, j)$ and $(r_3 - r_1, k)$. Thus the general form of this tensor is given by
$$C_{ijk}(r_1,r_2,r_3) = A_1(|r_2 - r_1|, |r_3 - r_1|)n_in_jm_k + A_1(|r_3 - r_1|, |r_2 - r_1|)m_im_kn_j$$
$$+ A_3(|r_2 - r_1|, |r_3 - r_1|)\delta_{ij}m_k + A_3(|r_3 - r_1|, |r_2 - r_1|)\delta_{ik}n_j$$
$$+ A_4(|r_2 - r_1|, |r_3 - r_1|)n_in_jn_k + A_4(|r_3 - r_1|, |r_2 - r_1|)m_im_jm_k$$
where $n$ is the unit vector along $r_2 - r_1$ and $m$ is the unit vector along $r_3 - r_1$.


[4] Estimating the parameters in an ARMA model
$$X_N = H_Na + G_Nb$$
where
$$X_N = [x(N), x(N-1), ..., x(0)]^T, \quad W_N = [w(N), w(N-1), ..., w(0)]^T$$
$$H_N = [z^{-1}X_N, ..., z^{-p}X_N], \quad G_N = [W_N, z^{-1}W_N, ..., z^{-q}W_N]$$
$$a = [a(1), ..., a(p)]^T, \quad b = [b(0), ..., b(q)]^T$$
This defines the ARMA model. In order to estimate $a, b$ from this model, we need to compute the pdf of $X_N$ given $a, b$ and then maximize this pdf over $a, b$.


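[5] Statistical properties of parameter estimates in the AR model using matrix perturbation theory
$$X_N = H_N\theta + W_N$$
where
$$X_N = [x(N), x(N-1), ..., x(0)]^T,$$
$$H_N = [z^{-1}X_N, ..., z^{-p}X_N],$$
$$W_N = [w(N), w(N-1), ..., w(0)]^T$$
The $w(n)$'s are assumed to be iid $N(0,\sigma^2)$. The least squares estimate is
$$\hat\theta(N) = (H_N^TH_N)^{-1}H_N^TX_N$$
Now
$$N^{-1}H_N^TH_N = ((N^{-1}(z^{-i}X_N)^T(z^{-j}X_N)))_{1\le i,j\le p} = \hat R,$$
$$N^{-1}H_N^TX_N = ((N^{-1}(z^{-i}X_N)^TX_N))_{1\le i\le p} = \hat r$$
where
$$r = ((r(i))), \quad r(i) = R(i) = E(x(n-i)x(n)), \quad R = ((R(i-j)))_{1\le i,j\le p}$$
$$\hat r(i) = N^{-1}\sum_{n=1}^Nx(n-i)x(n), \quad \hat R(i,j) = N^{-1}\sum_{n=1}^Nx(n-i)x(n-j)$$
We write
$$\hat R = R + \delta R, \quad \hat r = r + \delta r$$
Clearly, since $\theta = R^{-1}r$,
$$\hat\theta(N) = \hat R^{-1}\hat r = (R + \delta R)^{-1}(r + \delta r)$$
$$\approx (R^{-1} - R^{-1}\delta R.R^{-1})(r + \delta r)$$
$$\approx \theta + R^{-1}\delta r - R^{-1}\delta R.\theta$$
so the estimation error is given by
$$e_N = \hat\theta(N) - \theta = R^{-1}\delta r - R^{-1}\delta R.\theta$$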
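
The first order perturbation formula can be compared with the exact estimation error on simulated data. This is a sketch under assumed values: an AR(2) model with coefficients $(0.5, -0.3)$, unit noise variance, and the exact autocorrelations obtained from the Yule-Walker equations; the name `simulate_ar2` is hypothetical.

```python
import numpy as np


def simulate_ar2(a1, a2, n, rng, burn=1000):
    """x(t) = a1 x(t-1) + a2 x(t-2) + w(t) with iid N(0,1) noise; burn-in samples are dropped."""
    w = rng.standard_normal(n + burn)
    x = np.zeros(n + burn)
    for t in range(2, n + burn):
        x[t] = a1 * x[t - 1] + a2 * x[t - 2] + w[t]
    return x[burn:]


a1, a2 = 0.5, -0.3
theta = np.array([a1, a2])

# Exact autocorrelations R(0), R(1), R(2) from the Yule-Walker equations (sigma^2 = 1)
rho1 = a1 / (1 - a2)
rho2 = a1 * rho1 + a2
R0 = 1.0 / (1 - a1 * rho1 - a2 * rho2)
R = R0 * np.array([[1.0, rho1], [rho1, 1.0]])
r = R0 * np.array([rho1, rho2])

x = simulate_ar2(a1, a2, 100000, np.random.default_rng(1))
x0, x1, x2 = x[2:], x[1:-1], x[:-2]          # x(n), x(n-1), x(n-2)
R_hat = np.array([[x1 @ x1, x1 @ x2], [x2 @ x1, x2 @ x2]]) / len(x0)
r_hat = np.array([x1 @ x0, x2 @ x0]) / len(x0)

theta_hat = np.linalg.solve(R_hat, r_hat)
dR, dr = R_hat - R, r_hat - r
R_inv = np.linalg.inv(R)
e_exact = theta_hat - theta
e_lin = R_inv @ dr - R_inv @ dR @ theta       # first order formula for e_N

print(e_exact, e_lin)
```

The two error vectors differ only at second order in $(\delta R, \delta r)$, since $e_N = \hat R^{-1}(\delta r - \delta R.\theta)$ exactly while the formula replaces $\hat R^{-1}$ by $R^{-1}$.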


Large deviation evaluation of the rate at which $e_N$ converges to zero as $N \to \infty$. Note that by ergodicity,
$$\delta r, \delta R \to 0, \quad N \to \infty$$
Evaluating the rate at which these converge to zero amounts to evaluating the rate at which $z[N] = N^{-1}\sum_{n=1}^Ny(n)$ converges to zero for any zero mean stationary process $y(n)$. We make use of the Gartner-Ellis theorem to evaluate the LDP rate:
$$\Lambda_N(\lambda) = N^{-1}\log(E\exp(N\lambda.z[N])) = N^{-1}\log\Big(E\exp\Big(\lambda.\sum_{n=1}^Ny(n)\Big)\Big)$$
If $y(n)$ is approximately Gaussian with zero mean and autocorrelation $R(n)$, then the above equals
$$N^{-1}\log\Big(\exp\Big((\lambda^2/2)\sum_{|n|\le N-1}(N - |n|)R(n)\Big)\Big) = (N^{-1}\lambda^2/2)\sum_{|n|\le N-1}(N - |n|)R(n)$$
which converges as $N \to \infty$ to
$$(\lambda^2/2)\sum_{n=-\infty}^{\infty}R(n) = \lambda^2S(0)/2$$
where
$$S(\omega) = \sum_nR(n)\exp(-j\omega n)$$
is the power spectral density of $y(n)$. This determines the large deviation rate. One can also ask for a formula for the rate at which the empirical distribution $L_N(.) = N^{-1}\sum_{n=1}^N\delta_{y(n)}$ converges to the one dimensional marginal distribution of $y(n)$. For this, we have to evaluate the limiting Gartner-Ellis logarithmic moment generating function
$$\Lambda(f) = \lim_{N\to\infty}N^{-1}\log\Big(E\Big(\exp\Big(\sum_{n=1}^Nf(y(n))\Big)\Big)\Big)$$


[6] Proof of the $L^2$-mean ergodic theorem for wide sense stationary processes under the condition $C(k) \to 0, |k| \to \infty$.
$$C(k) = R(k) - \mu^2, \quad \mu = E(x(n)), \quad R(k) = E(x(n)x(n+k))$$
Let
$$S_N = \sum_{n=1}^Nx(n)$$
Then
$$E(S_N/N - \mu)^2 = N^{-1}\sum_{|k|\le N-1}(1 - |k|/N)C(k) \to 0, \quad N \to \infty$$
provided that
$$C(N) \to 0$$
This follows by the Cesaro theorem: if $a_n \to 0, n \to \infty$, then $n^{-1}\sum_{k=1}^na_k \to 0$. This proves the mean ergodic theorem for wide sense stationary processes. Now let $x(n)$ be a stationary Gaussian process. Fix $k \in \mathbb{Z}$ and put $y(n) = (x(n) - \mu_x)(x(n+k) - \mu_x)$. Then $y(n)$ is also a stationary process with
$$E(y(n)) = C_x(k), \quad C_y(m) = E(y(n+m)y(n)) - C_x(k)^2$$
$$= E[(y(n+m) - C_x(k))(y(n) - C_x(k))]$$
$$= C_x(m)^2 + C_x(m+k)C_x(m-k) \to 0, \quad |m| \to \infty$$
provided that $C_x(m) \to 0, |m| \to \infty$. Thus, by the mean ergodic theorem applied to $y(n)$,
$$N^{-1}\sum_{n=1}^Ny(n) \to C_x(k)$$
which is the same as saying that
$$N^{-1}\sum_{n=1}^Nx(n)x(n+k) \to R_x(k)$$
since
$$N^{-1}\sum_{n=1}^Nx(n) \to \mu_x$$
by the mean ergodic theorem applied to $x(n)$.
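
The variance identity used in the proof can be verified directly. The sketch below assumes an illustrative geometric covariance $C(k) = 2(0.8)^{|k|}$ and compares the single sum with the double sum $N^{-2}\sum_{n,m}C(n-m)$; the function names are hypothetical.

```python
import numpy as np


def var_of_mean_formula(C, N):
    """N^-1 * sum_{|k| <= N-1} (1 - |k|/N) C(k)."""
    k = np.arange(-(N - 1), N)
    return np.sum((1 - np.abs(k) / N) * C(k)) / N


def var_of_mean_direct(C, N):
    """E(S_N/N - mu)^2 = N^-2 * sum_{n,m} C(n - m), evaluated as a full double sum."""
    idx = np.arange(N)
    return C(idx[:, None] - idx[None, :]).sum() / N ** 2


def C(k):
    """Illustrative covariance sequence that decays to zero."""
    return 2.0 * 0.8 ** np.abs(k)


for N in (10, 100, 1000):
    print(N, var_of_mean_formula(C, N), var_of_mean_direct(C, N))
```

The printed variances decrease roughly like $S(0)/N$ with $S(0) = \sum_kC(k)$, in line with the Cesaro argument.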


[7] Quantum filtering of cavity resonator fields in interaction with a bath. Consider first the TM modes:
$$H_z = 0, \quad E_z(t,x,y,z) = \sum_{mnp}Re(c(mnp)\exp(j\omega(mnp)t))u_{mnp}(x,y,z)$$
$$E_\perp(t,x,y,z) = \sum_{mnp}h_{mn}^{-2}Re(c(mnp)\exp(j\omega(mnp)t))\partial_z\nabla_\perp u_{mnp}(x,y,z)$$
or equivalently,
$$E_x(t,x,y,z) = \sum_{mnp}h_{mn}^{-2}(m\pi/a)(p\pi/d)Re(c(mnp)\exp(j\omega(mnp)t))v_{mnp}(x,y,z)$$
$$E_y(t,x,y,z) = \sum_{mnp}h_{mn}^{-2}(n\pi/b)(p\pi/d)Re(c(mnp)\exp(j\omega(mnp)t))w_{mnp}(x,y,z)$$
where
$$u_{mnp}(x,y,z) = ((2\sqrt2)/\sqrt{abd})\sin(m\pi x/a)\sin(n\pi y/b)\cos(p\pi z/d)$$
$$v_{mnp}(x,y,z) = -((2\sqrt2)/\sqrt{abd})\cos(m\pi x/a)\sin(n\pi y/b)\sin(p\pi z/d)$$
$$w_{mnp}(x,y,z) = -((2\sqrt2)/\sqrt{abd})\sin(m\pi x/a)\cos(n\pi y/b)\sin(p\pi z/d)$$
Let $\langle . \rangle$ denote time average. Then
$$\int_{box}\langle E_z^2\rangle dxdydz = (1/2)\sum_{mnp}|c(mnp)|^2,$$
$$\int_{box}\langle E_x^2 + E_y^2\rangle dxdydz = (1/2)\sum_{mnp}|c(mnp)|^2h_{mn}^{-4}(p\pi/d)^2(h_{mn}^2) = (1/2)\sum_{mnp}|c(mnp)|^2((p\pi/d)^2/h_{mn}^2)$$
where
$$h_{mn}^2 = \pi^2(m^2/a^2 + n^2/b^2)$$
The total energy in the cavity due to the electric field is then
$$(\epsilon/2)\int_{box}\langle E_z^2 + |E_\perp|^2\rangle dxdydz = \sum_{mnp}\lambda(mnp)|c(mnp)|^2$$
which can be abbreviated to
$$H_s = \sum_n\omega(n)c(n)^*c(n)$$
where $c(n)$ are annihilation operators of the cavity field and $c(n)^*$ are the creation operators. They satisfy the Boson CCR:
$$[c(n), c(m)^*] = \delta[n - m]$$
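
The normalisation of the mode functions and the transverse energy factor can be checked by direct numerical integration. The sketch below assumes box dimensions $a = 1, b = 2, d = 3$ and a midpoint rule on a uniform grid (both are choices made for illustration, and `mode_checks` is a hypothetical name). It evaluates $\int u_{mnp}^2$ and the integral of the squared spatial profiles of $E_x$ and $E_y$ for a single mode.

```python
import numpy as np


def mode_checks(m, n, p, a=1.0, b=2.0, d=3.0, grid=64):
    """Return (int u^2, int of squared transverse profiles, (p*pi/d)^2 / h_mn^2) for one TM mode."""
    x = (np.arange(grid) + 0.5) * a / grid   # midpoint rule nodes
    y = (np.arange(grid) + 0.5) * b / grid
    z = (np.arange(grid) + 0.5) * d / grid
    X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
    dV = (a / grid) * (b / grid) * (d / grid)
    amp = 2 * np.sqrt(2) / np.sqrt(a * b * d)

    u = amp * np.sin(m * np.pi * X / a) * np.sin(n * np.pi * Y / b) * np.cos(p * np.pi * Z / d)
    v = -amp * np.cos(m * np.pi * X / a) * np.sin(n * np.pi * Y / b) * np.sin(p * np.pi * Z / d)
    w = -amp * np.sin(m * np.pi * X / a) * np.cos(n * np.pi * Y / b) * np.sin(p * np.pi * Z / d)

    h2 = np.pi ** 2 * (m ** 2 / a ** 2 + n ** 2 / b ** 2)
    ex = (m * np.pi / a) * (p * np.pi / d) * v / h2   # spatial profile of E_x
    ey = (n * np.pi / b) * (p * np.pi / d) * w / h2   # spatial profile of E_y

    norm_u = np.sum(u ** 2) * dV
    transverse = np.sum(ex ** 2 + ey ** 2) * dV
    return norm_u, transverse, (p * np.pi / d) ** 2 / h2


norm_u, transverse, expected = mode_checks(1, 2, 1)
print(norm_u, transverse, expected)
```

The first value is 1 and the last two agree, which are the factors multiplying $(1/2)|c(mnp)|^2$ in the two time averaged integrals above.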


The bath field is
$$E_b(t,r) = \sum_k[A_k(t)\phi_k(r) + A_k(t)^*\phi_k(r)^* + \Lambda_k(t)\eta_k(r)]$$
where $A_k(.), A_k(.)^*, \Lambda_k(.)$ are the fundamental noise processes in the quantum stochastic calculus of Hudson and Parthasarathy. They satisfy the quantum Ito formula
$$dA_kdA_m^* = \delta_{km}dt, \quad d\Lambda_k.d\Lambda_m = \delta_{km}d\Lambda_k,$$
$$dA_kd\Lambda_m = \delta_{km}dA_k, \quad d\Lambda_mdA_k^* = \delta_{km}dA_k^*$$


Denote the above system electric field $E$ by $E_s(t,r)$. Then the total field energy of the system plus bath fields within the cavity resonator is given by
$$(\epsilon/2)\int_{box}|E_s(t,r) + E_b(t,r)|^2d^3r$$
Ignoring the bath energy, the total field energy of the system (i.e., cavity resonator) plus its interaction energy with the bath is given by
$$H(t) = H_s + H_I(t)$$
where
$$H_s = (\epsilon/2)\int_{box}|E_s(t,r)|^2d^3r,$$
$$H_I(t) = \epsilon\int_{box}(E_s(t,r), E_b(t,r))d^3r = \sum_k(L_k(t)dA_k + M_k(t)dA_k^* + N_k(t)d\Lambda_k)$$
with the $L_k(t), M_k(t), N_k(t)$ being system operators defined by
$$L_k(t) = \int_{box}\phi_k(r)E_s(t,r)d^3r, \quad M_k(t) = \int_{box}\phi_k(r)^*E_s(t,r)d^3r,$$
$$N_k(t) = \int_{box}\eta_k(r)E_s(t,r)d^3r$$
Writing
$$E_s(t,r) = \sum_nRe(c(n).\exp(j\omega(n)t))F_n(r)$$
we get
$$L_k(t) = \sum_n(l_{1kn}(t)c(n) + l_{2kn}(t)c(n)^*)$$
where
$$l_{1kn}(t) = (1/2)\exp(j\omega(n)t)\int_{box}\phi_k(r)F_n(r)d^3r$$
$$l_{2kn}(t) = (1/2)\exp(-j\omega(n)t)\int_{box}\phi_k(r)F_n(r)d^3r$$
$$M_k(t) = \sum_n(m_{1kn}(t)c(n) + m_{2kn}(t)c(n)^*)$$
where
$$m_{1kn}(t) = (1/2)\exp(j\omega(n)t)\int_{box}\phi_k(r)^*F_n(r)d^3r$$
$$m_{2kn}(t) = (1/2)\exp(-j\omega(n)t)\int_{box}\phi_k(r)^*F_n(r)d^3r$$
and
$$N_k(t) = \sum_n(n_{1kn}(t)c(n) + n_{2kn}(t)c(n)^*)$$
where
$$n_{2kn}(t) = (1/2)\exp(-j\omega(n)t)\int_{box}\eta_k(r)F_n(r)d^3r = \bar n_{1kn}(t)$$


Remark: In computing the system Hamiltonian, we must in addition consider the contribution to the system field energy coming from the magnetic field. For the TM case under consideration, this energy is
$$H_{sm} = (\mu/2)\int_{box}\langle|H_\perp(t,r)|^2\rangle d^3r$$
where
$$H_\perp = (j\omega\epsilon/h^2)\nabla_\perp E_z \times \hat z$$
for a fixed frequency and mode, or more precisely, with regard to our modal expansion,
$$H_\perp(t,r) = \epsilon\sum_{mnp}h_{mn}^{-2}Re(j\omega(mnp)c(mnp)\exp(j\omega(mnp)t)).\nabla_\perp u_{mnp}(r) \times \hat z$$
We then find that
$$\int_{box}\langle|H_\perp(t,r)|^2\rangle d^3r = (\epsilon^2/2)\sum_{mnp}\omega(mnp)^2h_{mn}^{-2}|c(mnp)|^2$$
and hence
$$H_{sm} = (\mu\epsilon^2/4)\sum_{mnp}\omega(mnp)^2h_{mn}^{-2}|c(mnp)|^2$$


(a) Fermionic fields as system fields interacting with a photonic bath
(b) Fermionic bath. The creation and annihilation operators satisfying CAR's are
$$J_k(t) = \int_0^t(-1)^{\Lambda_k(s)}dA_k(s)$$
$$J_k(t)^* = \int_0^t(-1)^{\Lambda_k(s)}dA_k(s)^*$$


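[8] Quantum filtering of Yang-Mills gauge fields in interaction with a bath
The Lagrangian density of the Yang-Mills gauge field is
$$L = (-1/4)Tr(F_{\mu\nu}F^{\mu\nu}) = (-1/4)F^a_{\mu\nu}F^{a\mu\nu}$$
where
$$F^a_{\mu\nu} = A^a_{\nu,\mu} - A^a_{\mu,\nu} + eC(abc)A^b_\mu A^c_\nu$$
We have
$$F^a_{0r} = A^a_{r,0} - A^a_{0,r} + eC(abc)A^b_0A^c_r$$
$$L = (-1/2)F^a_{0r}F^{a0r} - (1/4)F^a_{rs}F^{ars}$$
so the canonical momentum corresponding to the position field $A^a_r$ is
$$P^{ar} = \partial L/\partial A^a_{r,0} = -F^{a0r}$$
Thus,
$$A^a_{r,0} = F^a_{0r} + A^a_{0,r} - eC(abc)A^b_0A^c_r$$
The Hamiltonian density is then
$$H = P^{ar}A^a_{r,0} - L = -F^{a0r}A^a_{r,0} + (1/2)F^a_{0r}F^{a0r} + (1/4)F^a_{rs}F^{ars}$$
The Yang-Mills field equations are
$$(D_\nu F^{\mu\nu})^a = 0$$
or more precisely in component form,
$$F^{a\mu\nu}_{,\nu} + eC(abc)A^b_\nu F^{c\mu\nu} = 0$$
These field equations can also be expressed in Lie algebra notation as
$$[\nabla_\nu, F^{\mu\nu}] = 0$$
where
$$\nabla_\mu = \partial_\mu - ieA_\mu = \partial_\mu + ieA^a_\mu\tau_a$$
since
$$[\partial_\nu, F^{\mu\nu}] = F^{\mu\nu}_{,\nu},$$
$$[A_\nu, F^{\mu\nu}] = [A^b_\nu\tau_b, F^{c\mu\nu}\tau_c] = A^b_\nu F^{c\mu\nu}[\tau_b, \tau_c] = A^b_\nu F^{c\mu\nu}iC(abc)\tau_a$$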

[9] Quantum field theoretic cavity resonator physics using photons, electrons, positrons, non-Abelian gauge Yang-Mills matter and particle fields and gravitons

[10] Quantum control via feedback

Quantum filtering and control algorithms were first introduced by V.P.Belavkin and perfected by John Gough, Kostler and Luc Bouten.


$$dU(t) = (-(iH + P)dt + L_1dA - L_2dA^* + Sd\Lambda(t))U(t)$$
We take an observable $X$ and note that its Heisenberg evolution is given by
$$j_t(X) = U(t)^*XU(t)$$
Then
$$dj_t(X) = j_t(\theta_0(X))dt + j_t(\theta_1(X))dA(t) + j_t(\theta_2(X))dA(t)^* + j_t(\theta_3(X))d\Lambda(t)$$
where $\theta_k, k = 0, 1, 2, 3$ are linear operators in the linear space of system observables. We take non-demolition measurements in the sense of Belavkin of the form
$$Y_o(t) = U(t)^*Y_i(t)U(t), \quad Y_i(t) = cA(t) + \bar cA(t)^* + k\Lambda(t)$$
The Belavkin filter for this measurement has the following form:
$$\pi_t(X) = E[j_t(X)|\eta_o(t)], \quad \eta_o(t) = \sigma(Y_o(s): s \le t)$$
$$d\pi_t(X) = F_t(X)dt + \sum_{k\ge1}G_{kt}(X)(dY_o(t))^k$$
where $F_t(X), G_{kt}(X) \in \eta_o(t)$. It is a commutative filter and is therefore called the stochastic Heisenberg equation. Its dual is the stochastic Schrodinger equation:
$$d\rho_t = F_t^*(\rho_t)dt + \sum_{k\ge1}G_{kt}^*(\rho_t)(dY_o(t))^k$$
Let $X_d(t)$ be the desired Heisenberg trajectory. Then, the tracking error at
time t is Xq(t) — j:(X). However, we cannot feed this error back into the HP 
noisy Schrodinger equation because we cannot measure j;(X) directly without 
perturbing the system. So we use in place of j;(X) its real time estimate 7,(X) 
based on the non-demolition measurements 7,(¢) upto time t and feedback in- 
stead the error Xy(t)—7;(X). The system dynamics after feedback is then given 
by 


dU (t) = [(-iH + u(t) — P(t))dt + Lyd A(t) — Lod A(t)* + SdA(t))U(t) 


where 


u(t) = K(Xa(t) — m(X)) 


or more generally, 
u(t) = Lf(Xa(t) — m(X)) 


We note that

$$dY_o(t) = dY_i(t) + dU(t)^*dY_i(t)U(t) + U(t)^*dY_i(t)dU(t)$$

$$= dY_i(t) - j_t(cL_2 + \bar{c}L_2^*)dt + k j_t(S + S^*)d\Lambda + k j_t(L_1 - L_2^*)dA + k j_t(L_1^* - L_2)dA^*$$

So measuring $dY_o$ amounts to measuring $-j_t(cL_2 + \bar{c}L_2^*)dt$ plus noise. In the
context of cavity resonator physics, we have that $-L_2$ is the coefficient of $dA^*$
in the HP equation and as we saw, this coefficient is proportional for the $k^{th}$ HP
mode to $\sum_n (l_{1kn}(t)c(n) + l_{2kn}(t)c(n)^*)$. This means that our non-demolition
measurement corresponds to measuring some projection of the cavity electric 
field plus noise. Actually, we can construct a whole class of non-demolition 
measurements that correspond to measuring several projections of the cavity 
electric field plus noise. 


[11] How to apply machine learning methods to problems in elec- 
tromagnetics, gravitation and quantum mechanics 

Given an em field $(E_i(\omega,r), H_i(\omega,r))$ incident upon a diseased tissue
characterized by an inhomogeneous permittivity tensor $\epsilon_{ab}(\omega,r)$ and an
inhomogeneous permeability tensor $\mu_{ab}(\omega,r)$, we determine the scattered em
fields $(E_s(\omega,r), H_s(\omega,r))$ after it gets scattered by the tissue. The aim is to
estimate the permittivity and permeability and derive characteristic features 
of these using a neural network and match these characteristic features with 
prototype features to determine the nature of the disease. We train the neural 
network to take as input a set of incident-scattered field pairs and output the 
permittivity-permeability parameters. Then when the neural network is pre- 
sented with another incident-scattered field pair, it will use its trained weights 
to generate the permittivity and permeability parameters which can be com- 
pared with the prototype. 

In quantum mechanics, machine learning can be applied as follows: Let $H(\theta)$
be the system Hamiltonian dependent upon an unknown parameter vector $\theta$ to
be estimated from repeated measurements of the state, taking into account the
collapse postulate. Let $\rho_k$ denote the state after the $k^{th}$ measurement taken at
time $t_k$, and let $\{M_a\}$ denote the POVM. Then, the state after the measurement
has been taken at time $t_{k+1}$ is given by

$$\rho_{k+1} = \sum_a \sqrt{M_a}\,U(t_{k+1} - t_k, \theta)\,\rho_k\,U(t_{k+1} - t_k, \theta)^*\sqrt{M_a}$$

if we make the measurement without noting the outcome, and if we note the
outcome as $a$, then the state at time $t_{k+1}$ just after the measurement has been
made is given by

$$\rho_{k+1}(a) = [\sqrt{M_a}\,U(t_{k+1} - t_k, \theta)\,\rho_k\,U(t_{k+1} - t_k, \theta)^*\sqrt{M_a}]/\mathrm{Tr}(\mathrm{numerator})$$

Here
$$U(t, \theta) = \exp(-itH(\theta))$$

The joint probability of getting measurement outcomes $a_1, ..., a_k$ respectively at
times $t_1 < t_2 < ... < t_k$ is given by

$$Pr(a_1, ..., a_k; t_1, ..., t_k|\theta) = \mathrm{Tr}\big(\sqrt{M_{a_k}}\,U(t_k - t_{k-1}, \theta) \cdots \sqrt{M_{a_1}}\,U(t_1, \theta)\,\rho_0\,U(t_1, \theta)^*\sqrt{M_{a_1}} \cdots U(t_k - t_{k-1}, \theta)^*\sqrt{M_{a_k}}\big)$$
where $\rho_0$ denotes the state at time $0$.
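The repeated measurement rule above can be sketched numerically. The following is a minimal illustration, assuming a finite dimensional system; the function names (`unitary`, `psd_sqrt`, `joint_probability`) and the qubit example with $H(\theta) = \theta\sigma_x$ are hypothetical choices made here for illustration and are not taken from the text. The joint probability is accumulated as a product of the conditional outcome probabilities, which equals the trace expression for the whole chain.

```python
import itertools
import numpy as np


def unitary(H, t):
    """U(t) = exp(-i t H) for a Hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * t * w)) @ V.conj().T


def psd_sqrt(M):
    """Square root of a positive semidefinite Hermitian matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T


def joint_probability(H, rho0, povm, outcomes, times):
    """Probability of noting outcomes[k] at times[k], k = 0, 1, ..."""
    roots = [psd_sqrt(M) for M in povm]
    rho = np.asarray(rho0, dtype=complex)
    prob, t_prev = 1.0, 0.0
    for a, t in zip(outcomes, times):
        U = unitary(H, t - t_prev)
        unnormalized = roots[a] @ U @ rho @ U.conj().T @ roots[a]
        p = np.trace(unnormalized).real   # conditional probability of outcome a
        if p <= 0.0:
            return 0.0
        prob *= p
        rho = unnormalized / p            # collapsed state after the measurement
        t_prev = t
    return prob


# Hypothetical example: a qubit with H(theta) = theta * sigma_x and a
# two-element, non-projective POVM {M0, I - M0}.
theta = 0.7
H = theta * np.array([[0.0, 1.0], [1.0, 0.0]])
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])
M0 = np.diag([0.8, 0.3])
povm = [M0, np.eye(2) - M0]
times = [0.5, 1.1, 2.0]

# Summing over all outcome sequences must give total probability one.
total = sum(joint_probability(H, rho0, povm, seq, times)
            for seq in itertools.product(range(2), repeat=len(times)))
```

Evaluating `joint_probability` on recorded outcome sequences for a grid of candidate $\theta$ values gives a likelihood function that an estimator, or a trained network, could then work with.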
[12] Lattice filters and the RLS lattice algorithm:
$$X(n) = [x(n), x(n-1), ..., x(0)]^T,$$
$$X_{n|p} = [z^{-1}X(n), ..., z^{-p}X(n)]$$
$$e_f(n|p) = X(n) - P_{n|p}X(n) = P^\perp_{n|p}X(n)$$
where $P_{n|p}$ is the orthogonal projection of $R^{n+1}$ onto $\mathrm{Range}(X_{n|p})$.
$$e_b(n-1|p) = P^\perp_{n|p}z^{-p-1}X(n)$$

Let $y(n)$ be another signal. Write
$$Y(n) = [y(n), y(n-1), ..., y(0)]^T$$
Let
$$Y(n|p+1) = P_{n+1|p+1}Y(n)$$

Note that $P_{n+1|p+1}$ is the orthogonal projection onto span $\{X(n), z^{-1}X(n), ..., z^{-p}X(n)\}$.
We can write

$$Y(n|p+1) = X_{n+1|p+1}h_{n,p+1}$$

We can also write

$$e_f(n|p) = X(n) + X_{n|p}a_{n,p}, \quad e_b(n-1|p) = z^{-p-1}X(n) + X_{n|p}b_{n-1,p}$$
Update formulas:
$$P_{n|p+1} = P_{n|p} + P_{P^\perp_{n|p}z^{-p-1}X(n)}$$

Thus
$$P^\perp_{n|p+1} = P^\perp_{n|p} - P_{P^\perp_{n|p}z^{-p-1}X(n)} = P^\perp_{n|p} - P_{e_b(n-1|p)}$$
Hence,

$$e_f(n|p+1) = e_f(n|p) - e_b(n-1|p)\langle e_b(n-1|p), X(n)\rangle/\|e_b(n-1|p)\|^2$$

$$= e_f(n|p) - e_b(n-1|p)\langle e_b(n-1|p), e_f(n|p)\rangle/\|e_b(n-1|p)\|^2$$
$$= e_f(n|p) - K(n|p+1)e_b(n-1|p)$$
from which it follows that

$$\|e_f(n|p+1)\|^2 = \|e_f(n|p)\|^2 - K(n|p+1)^2\|e_b(n-1|p)\|^2$$

Likewise, with $\tilde{K}(n|p+1) = \langle e_b(n-1|p), e_f(n|p)\rangle/\|e_f(n|p)\|^2$ (the same inner
product, but normalized by the forward error energy),

$$e_b(n|p+1) = e_b(n-1|p) - \tilde{K}(n|p+1)e_f(n|p)$$
and hence
$$\|e_b(n|p+1)\|^2 = \|e_b(n-1|p)\|^2 - \tilde{K}(n|p+1)^2\|e_f(n|p)\|^2$$
We also easily see using the Gram-Schmidt orthonormalization process that
$$Y(n|p) = X_{n+1|p+1}h_{n,p+1} = X(n)h_n(0) + z^{-1}X(n)h_n(1) + ... + z^{-p}X(n)h_n(p)$$
$$= e_b(n|0)g_n(0) + e_b(n|1)g_n(1) + ... + e_b(n|p)g_n(p)$$

where
$$g_n(k) = \langle Y(n), e_b(n|k)\rangle/\|e_b(n|k)\|^2, \quad k = 0,1,...,p$$

and hence
$$Y(n|p+1) = Y(n|p) + \langle Y(n), e_b(n|p+1)\rangle e_b(n|p+1)/\|e_b(n|p+1)\|^2$$
or in other words,
$$g_n(p+1) = \langle Y(n), e_b(n|p+1)\rangle/\|e_b(n|p+1)\|^2$$
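The forward order update can be checked numerically. The sketch below is an illustration only: it computes the prediction errors directly by least squares (not by the lattice recursion) and then verifies that $e_f(n|p+1) = e_f(n|p) - K e_b(n-1|p)$ with $K = \langle e_b(n-1|p), e_f(n|p)\rangle/\|e_b(n-1|p)\|^2$. The helper names `shift`, `residual`, `forward_error` and `backward_error` are assumptions introduced here.

```python
import numpy as np


def shift(v, k):
    """z^{-k} v for v = [v(n), ..., v(0)]: delay by k samples, zero filled."""
    out = np.zeros_like(v)
    if k < len(v):
        out[: len(v) - k] = v[k:]
    return out


def residual(target, columns):
    """Component of target orthogonal to the span of the given columns."""
    if not columns:
        return target.copy()
    A = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return target - A @ coef


def forward_error(X, p):
    """e_f(n|p): residual of X(n) on z^{-1}X(n), ..., z^{-p}X(n)."""
    return residual(X, [shift(X, k) for k in range(1, p + 1)])


def backward_error(X, p):
    """e_b(n-1|p): residual of z^{-p-1}X(n) on the same regressors."""
    return residual(shift(X, p + 1), [shift(X, k) for k in range(1, p + 1)])


rng = np.random.default_rng(0)
X = rng.standard_normal(40)          # X(n) with n = 39, newest sample first
p = 3
e_f, e_b = forward_error(X, p), backward_error(X, p)
K = (e_b @ e_f) / (e_b @ e_b)        # reflection coefficient K(n|p+1)
lhs = forward_error(X, p + 1)        # order p+1 error computed directly
rhs = e_f - K * e_b                  # order p+1 error from the order update
```

The same check applies to the error energies: `lhs @ lhs` should agree with `e_f @ e_f - K**2 * (e_b @ e_b)`.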


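We now look at time updates:

$$X_{n+1,p} = \begin{pmatrix} \xi_{n,p}^T \\ X_{n,p} \end{pmatrix}$$

where
$$\xi_{n,p} = [x(n), x(n-1), ..., x(n+1-p)]^T$$

Then,
$$a_{n,p} = -(X_{n,p}^TX_{n,p})^{-1}X_{n,p}^TX(n) = -R_{n,p}^{-1}X_{n,p}^TX(n)$$

and
$$R_{n+1,p} = X_{n+1,p}^TX_{n+1,p} = R_{n,p} + \xi_{n,p}\xi_{n,p}^T$$

Application of the matrix inversion lemma then gives
$$R_{n+1,p}^{-1} = R_{n,p}^{-1} - R_{n,p}^{-1}\xi_{n,p}\xi_{n,p}^TR_{n,p}^{-1}/(1 + \xi_{n,p}^TR_{n,p}^{-1}\xi_{n,p})$$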
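The rank-one form of the matrix inversion lemma used for the time update of $R_{n,p}^{-1}$ can be sketched as follows. This is a minimal illustration assuming a symmetric, invertible $R$; the function name `rank_one_inverse_update` and the random test data are hypothetical.

```python
import numpy as np


def rank_one_inverse_update(R_inv, xi):
    """Inverse of R + xi xi^T, given R^{-1} for a symmetric invertible R."""
    mu = R_inv @ xi                  # R^{-1} xi
    eta = xi @ mu                    # xi^T R^{-1} xi
    return R_inv - np.outer(mu, mu) / (1.0 + eta)


rng = np.random.default_rng(0)
X = rng.standard_normal((20, 4))     # stand-in for the data matrix
R = X.T @ X
xi = rng.standard_normal(4)          # stand-in for the new data row
updated = rank_one_inverse_update(np.linalg.inv(R), xi)
direct = np.linalg.inv(R + np.outer(xi, xi))
```

The update costs $O(p^2)$ operations per time step, against $O(p^3)$ for inverting $R_{n+1,p}$ afresh.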


RLS lattice algorithm continued
$x[n], n \ge 0$ is a process. Let

$$x_n = [x[n], x[n-1], ..., x[0]]^T \in R^{n+1},$$
and let
$$z^{-k}x_n = [x[n-k], x[n-k-1], ..., x[0], 0, 0, ..., 0]^T \in R^{n+1},$$

i.e., we interpret $z^{-k}x[m] = x[m-k] = 0$ if $k > m$. Forward predictor of order $p$
at time $n$:
$$e_f(n|p) = P^\perp_{n,p}x_n$$

where $P_{n,p}$ is the orthogonal projection onto $R(X_{n,p})$ where
$$X_{n,p} = [z^{-1}x_n, z^{-2}x_n, ..., z^{-p}x_n] \in R^{(n+1) \times p}$$
Backward predictor of order $p$ at time $n-1$:
$$e_b(n-1|p) = P^\perp_{n,p}z^{-p-1}x_n$$
From the basic theory of projection operators in Hilbert space, it follows that

$$e_f(n|p+1) = e_f(n|p) - K_f(n|p)e_b(n-1|p)$$

$$e_b(n|p+1) = \begin{pmatrix} e_b(n-1|p) \\ 0 \end{pmatrix} - K_b(n|p)\begin{pmatrix} e_f(n|p) \\ 0 \end{pmatrix}$$

where
$$K_f(n|p) = \langle e_f(n|p), e_b(n-1|p)\rangle/E_b(n-1|p),$$
$$K_b(n|p) = \langle e_f(n|p), e_b(n-1|p)\rangle/E_f(n|p),$$
$$E_f(n|p) = \|e_f(n|p)\|^2, \quad E_b(n-1|p) = \|e_b(n-1|p)\|^2$$
Then,

$$E_f(n|p+1) = E_f(n|p) - K_f(n|p)^2E_b(n-1|p)$$
$$E_b(n|p+1) = E_b(n-1|p) - K_b(n|p)^2E_f(n|p)$$

$$e_f(n|p) = x_n + X_{n,p}a_{n,p}$$

$$e_b(n-1|p) = z^{-p-1}x_n + X_{n,p}b_{n-1,p}$$

$$a_{n,p} = -R_{n,p}^{-1}X_{n,p}^Tx_n, \quad b_{n-1,p} = -R_{n,p}^{-1}X_{n,p}^Tz^{-p-1}x_n$$
where
$$R_{n,p} = X_{n,p}^TX_{n,p}$$
$$X_{n+1,p} = \begin{pmatrix} \xi_{n,p}^T \\ X_{n,p} \end{pmatrix}, \quad \xi_{n,p} = [x[n], x[n-1], ..., x[n+1-p]]^T$$
Also,
$$X_{n,p+1} = [X_{n,p}, z^{-p-1}x_n]$$
Then,
$$R_{n+1,p} = R_{n,p} + \xi_{n,p}\xi_{n,p}^T$$
$$R_{n,p+1} = \begin{pmatrix} R_{n,p} & X_{n,p}^Tz^{-p-1}x_n \\ (z^{-p-1}x_n)^TX_{n,p} & \|z^{-p-1}x_n\|^2 \end{pmatrix}$$
Also,
$$X_{n,p+1} = [z^{-1}x_n, z^{-1}X_{n,p}]$$
and hence
$$X_{n+1,p+1} = [z^{-1}x_{n+1}, z^{-1}X_{n+1,p}]$$
where
$$z^{-1}x_{n+1} = [x[n], x[n-1], ..., x[0], 0]^T = [x_n^T, 0]^T$$
and
$$z^{-1}X_{n+1,p} = \begin{pmatrix} X_{n,p} \\ 0 \end{pmatrix}$$
and hence
$$(z^{-1}X_{n+1,p})^T(z^{-1}X_{n+1,p}) = X_{n,p}^TX_{n,p} = R_{n,p}$$
Then,

$$R_{n+1,p+1} = \begin{pmatrix} \|x_n\|^2 & x_n^TX_{n,p} \\ X_{n,p}^Tx_n & R_{n,p} \end{pmatrix}$$
Note: We have that $e_b(n-1|p) = P^\perp_{n,p}z^{-p-1}x_n$ and hence $e_b(n|p+1) =
P^\perp_{n+1,p+1}z^{-p-2}x_{n+1}$ and further,

$$X_{n+1,p+1} = [z^{-1}x_{n+1}, z^{-2}x_{n+1}, ..., z^{-p-1}x_{n+1}] = [[x_n^T, 0]^T, [(z^{-1}x_n)^T, 0]^T, ..., [(z^{-p}x_n)^T, 0]^T]$$

$$= \begin{pmatrix} [x_n, X_{n,p}] \\ 0 \end{pmatrix}$$
and since $z^{-p-2}x_{n+1} = [(z^{-p-1}x_n)^T, 0]^T$, we easily see that

$$e_b(n|p+1) = P^\perp_{n+1,p+1}z^{-p-2}x_{n+1} = [(P^\perp_{R([x_n, X_{n,p}])}z^{-p-1}x_n)^T, 0]^T$$

and
$$P_{R([x_n, X_{n,p}])} = P_{R([e_f(n|p), X_{n,p}])}$$

from which it follows that
$$e_b(n|p+1) = [(P^\perp_{n,p}z^{-p-1}x_n - \langle z^{-p-1}x_n, e_f(n|p)\rangle e_f(n|p)/E_f(n|p))^T, 0]^T$$

$$= [(e_b(n-1|p) - K_b(n|p)e_f(n|p))^T, 0]^T$$
We write
$$A_{n,p}(z) = 1 + \sum_{k=1}^{p} a_{n,p}(k)z^{-k},$$

$$B_{n,p}(z) = z^{-p} + \sum_{k=1}^{p} b_{n-1,p}(k)z^{-k+1}$$

Then, we can write

$$e_f(n|p) = A_{n,p}(z)x_n, \quad e_b(n-1|p) = z^{-1}B_{n,p}(z)x_n = B_{n,p}(z)z^{-1}x_n$$

Thus,
$$A_{n,p+1}(z) = A_{n,p}(z) - K_f(n|p)z^{-1}B_{n,p}(z),$$
$$B_{n+1,p+1}(z) = z^{-1}B_{n,p}(z) - K_b(n|p)A_{n,p}(z)$$
Remark:
$$e_b(n|p+1) = z^{-1}B_{n+1,p+1}(z)x_{n+1} = B_{n+1,p+1}(z)[x_n^T, 0]^T = [(B_{n+1,p+1}(z)x_n)^T, 0]^T$$
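$$a_{n+1,p+1} = -R_{n+1,p+1}^{-1}X_{n+1,p+1}^Tx_{n+1}$$
$$= -\begin{pmatrix} \|x_n\|^2 & x_n^TX_{n,p} \\ X_{n,p}^Tx_n & R_{n,p} \end{pmatrix}^{-1}(\xi_{n,p+1}x[n+1] + X_{n,p+1}^Tx_n)$$
where
$$\xi_{n,p} = [x[n], x[n-1], ..., x[n+1-p]]^T$$
Note that
$$X_{n+1,p+1} = \begin{pmatrix} \xi_{n,p+1}^T \\ X_{n,p+1} \end{pmatrix}$$
Also,

$$X_{n+1,p+1}^Tx_{n+1} = [\xi_{n,p+1}, X_{n,p+1}^T][x[n+1], x_n^T]^T = x[n+1]\xi_{n,p+1} + X_{n,p+1}^Tx_n$$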

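Remark: We now derive a useful formula for the inverse of a block structured
symmetric matrix: Let the matrix

$$\begin{pmatrix} a & b^T \\ b & R \end{pmatrix}$$

have the inverse

$$\begin{pmatrix} q & p^T \\ p & Q \end{pmatrix}$$

where $R^T = R$. Writing out the product of the inverse with the matrix and
equating it to the identity, we find that

$$qa + p^Tb = 1, \quad qb^T + p^TR = 0, \quad pb^T + QR = I$$

and hence,
$$q = 1/(a - b^TR^{-1}b)$$
and
$$p = -qR^{-1}b = -(R^{-1}b)/(a - b^TR^{-1}b)$$
$$Q = (I - pb^T)R^{-1} = R^{-1} + R^{-1}bb^TR^{-1}/(a - b^TR^{-1}b)$$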
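The block inverse formula can be verified numerically. The sketch below is an illustration under the assumption that the full matrix is symmetric positive definite, so that both $R$ and the Schur complement $a - b^TR^{-1}b$ are invertible; the function name `block_inverse` and the random test matrix are hypothetical.

```python
import numpy as np


def block_inverse(a, b, R_inv):
    """Blocks (q, p, Q) of the inverse of [[a, b^T], [b, R]] for symmetric R."""
    Rb = R_inv @ b
    schur = a - b @ Rb               # a - b^T R^{-1} b
    q = 1.0 / schur
    p = -q * Rb
    Q = R_inv + np.outer(Rb, Rb) / schur
    return q, p, Q


rng = np.random.default_rng(0)
G = rng.standard_normal((8, 5))
M = G.T @ G                          # symmetric positive definite test matrix
a, b, R = M[0, 0], M[1:, 0], M[1:, 1:]
q, p, Q = block_inverse(a, b, np.linalg.inv(R))

# Reassemble the inverse from its blocks and compare with a direct inverse.
assembled = np.block([[np.array([[q]]), p[None, :]], [p[:, None], Q]])
direct = np.linalg.inv(M)
```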
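Taking
$$a = \|x_n\|^2, \quad b = X_{n,p}^Tx_n, \quad R = R_{n,p}$$
gives us

$$q = (\|x_n\|^2 + x_n^TX_{n,p}a_{n,p})^{-1} = 1/(x_n^T(x_n + X_{n,p}a_{n,p})) = 1/E_f(n|p)$$

and
$$p = a_{n,p}/E_f(n|p)$$
$$Q = R_{n,p}^{-1} + R_{n,p}^{-1}X_{n,p}^Tx_nx_n^TX_{n,p}R_{n,p}^{-1}/E_f(n|p)$$
$$= R_{n,p}^{-1} + a_{n,p}a_{n,p}^T/E_f(n|p)$$
Remark:

$$e_f(n|p) = x_n + X_{n,p}a_{n,p}, \quad E_f(n|p) = \|e_f(n|p)\|^2 = x_n^T(x_n + X_{n,p}a_{n,p}) = x_n^Te_f(n|p)$$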

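Then,
$$x_{n+1} = [x[n+1], x_n^T]^T, \quad X_{n+1,p} = \begin{pmatrix} \xi_{n,p}^T \\ X_{n,p} \end{pmatrix}$$
$$R_{n+1,p} = X_{n+1,p}^TX_{n+1,p} = \xi_{n,p}\xi_{n,p}^T + R_{n,p}$$

Thus,

$$a_{n+1,p} = -R_{n+1,p}^{-1}X_{n+1,p}^Tx_{n+1}$$

$$= -(R_{n,p}^{-1} - \mu_{n,p}\mu_{n,p}^T/(1 + \eta_{n,p}))(\xi_{n,p}x[n+1] + X_{n,p}^Tx_n)$$

$$= a_{n,p} - k_{n,p}(x[n+1] + \xi_{n,p}^Ta_{n,p})$$

where

$$\mu_{n,p} = R_{n,p}^{-1}\xi_{n,p}, \quad \xi_{n,p} = [x[n], x[n-1], ..., x[n+1-p]]^T$$
$$\eta_{n,p} = \xi_{n,p}^TR_{n,p}^{-1}\xi_{n,p} = \xi_{n,p}^T\mu_{n,p}$$
$$k_{n,p} = \mu_{n,p}/(1 + \eta_{n,p})$$

$$\mu_{n+1,p+1} = R_{n+1,p+1}^{-1}\xi_{n+1,p+1}$$

$$= \begin{pmatrix} 1/E_f(n|p) & a_{n,p}^T/E_f(n|p) \\ a_{n,p}/E_f(n|p) & R_{n,p}^{-1} + a_{n,p}a_{n,p}^T/E_f(n|p) \end{pmatrix}\begin{pmatrix} x[n+1] \\ \xi_{n,p} \end{pmatrix}$$

$$= \begin{pmatrix} x[n+1]/E_f(n|p) + a_{n,p}^T\xi_{n,p}/E_f(n|p) \\ a_{n,p}x[n+1]/E_f(n|p) + \mu_{n,p} + a_{n,p}a_{n,p}^T\xi_{n,p}/E_f(n|p) \end{pmatrix}$$
RLS lattice algorithm continued:

$$e_f(n|p+1) = e_f(n|p) - K_f(n|p)e_b(n-1|p), \quad e_b(n|p+1) = [\tilde{e}_b(n|p+1)^T, 0]^T$$

where
$$\tilde{e}_b(n|p+1) = e_b(n-1|p) - K_b(n|p)e_f(n|p)$$
Then,
$$e_f(n|p) = x_n + X_{n,p}a_{n,p}, \quad e_b(n-1|p) = z^{-p-1}x_n + X_{n,p}b_{n-1,p}$$
$$a_{n,p} = -R_{n,p}^{-1}X_{n,p}^Tx_n, \quad b_{n-1,p} = -R_{n,p}^{-1}X_{n,p}^Tz^{-p-1}x_n$$
$$R_{n,p} = X_{n,p}^TX_{n,p}$$
$$R_{n+1,p} = \xi_{n,p}\xi_{n,p}^T + R_{n,p}$$
$$R_{n+1,p}^{-1} = R_{n,p}^{-1} - \mu_{n,p}\mu_{n,p}^T/(1 + \eta_{n,p})$$
Note that

$$X_{n+1,p} = \begin{pmatrix} \xi_{n,p}^T \\ X_{n,p} \end{pmatrix}$$

$$\mu_{n,p} = R_{n,p}^{-1}\xi_{n,p}$$
$$a_{n+1,p} = -R_{n+1,p}^{-1}X_{n+1,p}^Tx_{n+1}$$
$$= -(R_{n,p}^{-1} - \mu_{n,p}\mu_{n,p}^T/(1 + \eta_{n,p}))(\xi_{n,p}x[n+1] + X_{n,p}^Tx_n)$$
$$= a_{n,p} - (\mu_{n,p}/(1 + \eta_{n,p}))(x[n+1] + \xi_{n,p}^Ta_{n,p})$$
$$= a_{n,p} - k_{n,p}e_f(n+1|n,p)$$
$$\mu_{n+1,p+1} = R_{n+1,p+1}^{-1}\xi_{n+1,p+1}$$

$$R_{n+1,p+1} = \begin{pmatrix} \|x_n\|^2 & x_n^TX_{n,p} \\ X_{n,p}^Tx_n & R_{n,p} \end{pmatrix}$$

$$R_{n+1,p+1}^{-1} = \begin{pmatrix} 1/E_f(n|p) & a_{n,p}^T/E_f(n|p) \\ a_{n,p}/E_f(n|p) & R_{n,p}^{-1} + a_{n,p}a_{n,p}^T/E_f(n|p) \end{pmatrix}$$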

RLS lattice continued:
$$a_{n+1,p} = a_{n,p} - k_{n,p}e_f(n+1|n,p)$$

where
$$e_f(n+1|n,p) = x[n+1] + \xi_{n,p}^Ta_{n,p}$$

Note that
$$e_f(n|n,p) = x[n] + \xi_{n-1,p}^Ta_{n,p}$$
is the first component of $e_f(n|p)$. $e_f(n+1|n,p)$ is the first component of
$e_f(n+1|p)$ but computed using the filter coefficients $a_{n,p}$ which are estimated on data
collected only up to the previous time $n$.

We have seen that

$$k_{n,p} = \mu_{n,p}/(1 + \eta_{n,p}), \quad \mu_{n,p} = R_{n,p}^{-1}\xi_{n,p}, \quad \eta_{n,p} = \xi_{n,p}^TR_{n,p}^{-1}\xi_{n,p}$$
We have also derived a time-order update formula relating $\mu_{n+1,p+1}$ with $\mu_{n,p}$.
This formula is given by
$$\mu_{n+1,p+1} = [e_f(n+1|n,p)/E_f(n|p),\ (\mu_{n,p} + E_f(n|p)^{-1}(a_{n,p}a_{n,p}^T\xi_{n,p} - x[n+1]R_{n,p}^{-1}X_{n,p}^Tx_n))^T]^T$$

$$= [e_f(n+1|n,p)/E_f(n|p),\ (\mu_{n,p} + E_f(n|p)^{-1}e_f(n+1|n,p)a_{n,p})^T]^T$$
The whole logic of the RLS lattice algorithm is to keep computing order and 
time updates for each of the variables that occur in the process so that finally 
the recursion closes, ie, gets completed. If at any stage, the recursion does not 
close, we introduce new variables and then compute order and time updates for 
the newly introduced variables. This process goes on until finally, the algorithm 
closes upon itself. We find that 


$$E_f(n+1|p) = \|e_f(n+1|p)\|^2 = \|x_{n+1} + X_{n+1,p}a_{n+1,p}\|^2,$$
$$e_f(n+1|p) = x_{n+1} + X_{n+1,p}(a_{n,p} - k_{n,p}e_f(n+1|n,p))$$
$$= [e_f(n+1|n,p) - \xi_{n,p}^Tk_{n,p}e_f(n+1|n,p),\ (e_f(n|p) - X_{n,p}k_{n,p}e_f(n+1|n,p))^T]^T$$
$$= [(1 + \eta_{n,p})^{-1}e_f(n+1|n,p),\ (e_f(n|p) - X_{n,p}k_{n,p}e_f(n+1|n,p))^T]^T$$

Thus, taking the norm square on both the sides and using the fact that $e_f(n|p)$
is orthogonal to $R(X_{n,p})$ gives us

$$E_f(n+1|p) = e_f(n+1|n,p)^2(1 + \eta_{n,p})^{-2} + E_f(n|p) + e_f(n+1|n,p)^2\eta_{n,p}/(1 + \eta_{n,p})^2$$

$$= e_f(n+1|n,p)^2/(1 + \eta_{n,p}) + E_f(n|p)$$
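The time updates for the forward predictor can be checked against a direct least squares solution. The sketch below is a minimal illustration: it writes $\xi$ for the newest data row $[x[n], ..., x[n+1-p]]^T$, $\mu = R^{-1}\xi$, $\eta = \xi^T\mu$ and $k = \mu/(1+\eta)$, then advances $(R^{-1}, a, E_f)$ from time $n$ to $n+1$. The function names `shift`, `batch_solution` and `time_update`, and the random test signal, are assumptions introduced here.

```python
import numpy as np


def shift(v, k):
    """z^{-k} v for v = [v[n], ..., v[0]]: delay by k samples, zero filled."""
    out = np.zeros_like(v)
    if k < len(v):
        out[: len(v) - k] = v[k:]
    return out


def batch_solution(x, n, p):
    """Direct least squares solution at time n: returns (R^{-1}, a, E_f)."""
    x_n = x[n::-1].copy()                       # [x[n], x[n-1], ..., x[0]]
    X = np.column_stack([shift(x_n, k) for k in range(1, p + 1)])
    R_inv = np.linalg.inv(X.T @ X)
    a = -R_inv @ (X.T @ x_n)
    e_f = x_n + X @ a
    return R_inv, a, e_f @ e_f


def time_update(x, n, p, R_inv, a, E_f):
    """Advance (R^{-1}, a, E_f) from time n to time n + 1 (requires n >= p)."""
    xi = x[n - p + 1 : n + 1][::-1]             # [x[n], ..., x[n+1-p]]
    mu = R_inv @ xi
    eta = xi @ mu
    k = mu / (1.0 + eta)                        # gain vector
    e_prior = x[n + 1] + xi @ a                 # e_f(n+1|n,p)
    return (R_inv - np.outer(mu, mu) / (1.0 + eta),
            a - k * e_prior,
            E_f + e_prior ** 2 / (1.0 + eta))


rng = np.random.default_rng(0)
x = rng.standard_normal(50)
n, p = 30, 4
R_inv_new, a_new, E_new = time_update(x, n, p, *batch_solution(x, n, p))
R_inv_ref, a_ref, E_ref = batch_solution(x, n + 1, p)
```

In this sketch the recursive and direct quantities should agree to rounding error, which is what the algebra above asserts.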


We now repeat this analysis for the backward prediction errors and filter
coefficients. First observe that

$$K_b(n+1|p) = \langle e_f(n+1|p), e_b(n|p)\rangle/E_f(n+1|p)$$

$$e_b(n|p) = P^\perp_{n+1,p}z^{-p-1}x_{n+1}$$

$$e_f(n|p+1) = x_n + X_{n,p+1}a_{n,p+1} = x_n + [X_{n,p}, z^{-p-1}x_n]a_{n,p+1}$$

$$= e_f(n|p) - K_f(n|p)e_b(n-1|p)$$
$$= (x_n + X_{n,p}a_{n,p}) - K_f(n|p)(z^{-p-1}x_n + X_{n,p}b_{n-1,p})$$
$$= x_n + [X_{n,p}, z^{-p-1}x_n][(a_{n,p} - K_f(n|p)b_{n-1,p})^T, -K_f(n|p)]^T$$

and hence,
$$a_{n,p+1} = \begin{pmatrix} a_{n,p} - K_f(n|p)b_{n-1,p} \\ -K_f(n|p) \end{pmatrix}$$
Likewise,
$$e_b(n|p+1) = z^{-p-2}x_{n+1} + X_{n+1,p+1}b_{n,p+1}$$
$$= \begin{pmatrix} z^{-p-1}x_n \\ 0 \end{pmatrix} + \begin{pmatrix} [x_n, X_{n,p}] \\ 0 \end{pmatrix}b_{n,p+1}$$
$$= \begin{pmatrix} e_b(n-1|p) - K_b(n|p)e_f(n|p) \\ 0 \end{pmatrix}$$
$$= \begin{pmatrix} z^{-p-1}x_n + X_{n,p}b_{n-1,p} - K_b(n|p)(x_n + X_{n,p}a_{n,p}) \\ 0 \end{pmatrix}$$
and hence

$$b_{n,p+1} = \begin{pmatrix} -K_b(n|p) \\ b_{n-1,p} - K_b(n|p)a_{n,p} \end{pmatrix}$$


These represent respectively the order update formulas for the forward and 
backward prediction filter coefficients. We have computed in addition the time 
update relation for the forward prediction filter coefficients. We now do this for 
the backward prediction filter coefficients. 


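$$b_{n-1,p} = -R_{n,p}^{-1}X_{n,p}^Tz^{-p-1}x_n$$
$$b_{n,p} = -R_{n+1,p}^{-1}X_{n+1,p}^Tz^{-p-1}x_{n+1}$$

$$= -(R_{n,p}^{-1} - \mu_{n,p}\mu_{n,p}^T/(1 + \eta_{n,p}))(\xi_{n,p}x[n-p] + X_{n,p}^Tz^{-p-1}x_n)$$
$$= b_{n-1,p} - k_{n,p}(x[n-p] + \xi_{n,p}^Tb_{n-1,p})$$
$$= b_{n-1,p} - k_{n,p}e_b(n|n-1,p)$$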

We now need to compute $E_b(n|p)$ in terms of $E_b(n-1|p)$ and also $N(n+1|p) =
\langle e_f(n+1|p), e_b(n|p)\rangle$ in terms of $N(n|p)$. We recall that

$$e_f(n+1|p) = [(1 + \eta_{n,p})^{-1}e_f(n+1|n,p),\ (e_f(n|p) - X_{n,p}k_{n,p}e_f(n+1|n,p))^T]^T$$
and
$$e_b(n|p) = z^{-p-1}x_{n+1} + X_{n+1,p}b_{n,p}$$

$$= \begin{pmatrix} x[n-p] + \xi_{n,p}^T(b_{n-1,p} - k_{n,p}e_b(n|n-1,p)) \\ z^{-p-1}x_n + X_{n,p}(b_{n-1,p} - k_{n,p}e_b(n|n-1,p)) \end{pmatrix}$$

$$= \begin{pmatrix} (1 + \eta_{n,p})^{-1}e_b(n|n-1,p) \\ e_b(n-1|p) - X_{n,p}k_{n,p}e_b(n|n-1,p) \end{pmatrix}$$

It follows then by forming the inner product of these two relations that
$$N(n+1|p) = (1 + \eta_{n,p})^{-2}e_f(n+1|n,p)e_b(n|n-1,p) + \langle e_f(n|p) - X_{n,p}k_{n,p}e_f(n+1|n,p),\ e_b(n-1|p) - X_{n,p}k_{n,p}e_b(n|n-1,p)\rangle$$
$$= (1 + \eta_{n,p})^{-2}e_f(n+1|n,p)e_b(n|n-1,p) + N(n|p) + (\eta_{n,p}/(1 + \eta_{n,p})^2)e_f(n+1|n,p)e_b(n|n-1,p)$$
$$= e_f(n+1|n,p)e_b(n|n-1,p)/(1 + \eta_{n,p}) + N(n|p)$$
Note that we have used the orthogonality relations

$$X_{n,p}^Te_f(n|p) = X_{n,p}^Te_b(n-1|p) = 0$$


[13] On a gadget constructed by my colleague Professor Dhananjay 
Gadre 

In the standard undergraduate course for engineering students called "Signals
and Systems", as well as in the associated laboratory courses, the students
are taught about how to generate various kinds of periodic signals like the
sinewave, the square wave, the ramp wave etc., how to construct the Fourier
series of such periodic signals as linear superpositions of higher harmonics of
a fundamental frequency and how to design and analyze lowpass, highpass and 
bandpass filters that would filter such periodic signals so as to retain only a 
certain small subset of the signal harmonic components. This problem is im- 
portant, for example, in a situation wherein a circuit designed on a bread-board 
which receives its input from a power supply gets the current/voltage across its 
components corrupted by the fundamental as well as higher harmonics of the 
basic 220-V AC voltage source. This happens because the power supply runs on 
the AC source and hence some part of the latter signals enter into the circuitry 
of the power supply. As a result, when we measure the voltage across any of 
the circuit’s elements using a CRO, we observe stray components coming from 
the AC source and its higher harmonics on the CRO screen. These harmonics 
are not desirable and hence may be regarded as constituting a component of 
the noise in addition to the thermal noise being produced in the resistors of the 
circuit. One way to get rid of these harmonics is to place a filter between the 
circuit and the CRO to eliminate these stray components. The trouble with 
such a method is that the filter will consume some of the current and will there- 
fore act as an undesirable load. It would thus be better to have a CRO which 
automatically does the filtering and hence gives a faithful representation of the 
circuit’s behaviour in the absence of the AC source fundamental and higher 
harmonic disturbance. Another example where such a filtering is required dur- 
ing the signal recording process is a telephone line in which speakers A and B 
communicate across a line and due to defects in the line, A hears not only B’s 
speech transmitted over the line but also his own echo. An echo canceller at 
A’s end will use a filter H(z) which may even be made adaptive and which takes
A’s original speech as input and predicts the signal received by A over the line 
(namely B’s speech plus A’s own echo). Since A’s speech is correlated with 
only A’s own echo and not with B’s speech, the filter H(z) will predict only 
A’s echo component in the total signal received by A. Thus, when the filter’s 
output is subtracted from the signal received by A, the result is that A gets 
the signal spoken by B with a major part of his own echo cancelled out. Suppose
we have a CRO that passes through only B’s frequency components and rejects
A’s frequency components. This is possible only when A’s and B’s speech
signals occupy non-overlapping frequency bands; if they do not, we can shift the
band of frequencies occupied by B’s speech via appropriate modulation at B’s end
and use a line which supports both the band of frequencies spoken by A and B’s
band shifted in frequency via modulation. Then, by just recording the signal
received by A at his end using such a CRO, A can get to know B’s speech with
his own echo removed.
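The echo cancellation scheme described above can be sketched with a normalized LMS adaptive filter. This is an illustration only: NLMS is one common choice of adaptation rule and is an assumption made here, as are the function name `nlms_echo_canceller`, the white noise stand-ins for the two speech signals and the four-tap echo path. The filter takes A’s own speech as input, predicts the received signal, and because A’s speech is correlated only with the echo, the prediction error is an estimate of B’s speech.

```python
import numpy as np


def nlms_echo_canceller(near, received, taps=8, step=0.1, eps=1e-8):
    """Adaptive FIR filter driven by the near-end speech; returns (error, weights)."""
    w = np.zeros(taps)
    buf = np.zeros(taps)                 # buf[k] holds near[n - k]
    out = np.zeros(len(received))
    for n in range(len(received)):
        buf = np.roll(buf, 1)
        buf[0] = near[n]
        e = received[n] - w @ buf        # received signal minus predicted echo
        w += step * e * buf / (eps + buf @ buf)
        out[n] = e
    return out, w


rng = np.random.default_rng(1)
N = 5000
a_speech = rng.standard_normal(N)              # stand-in for A's speech
b_speech = 0.1 * rng.standard_normal(N)        # stand-in for B's speech
echo_path = np.array([0.6, 0.3, -0.2, 0.1])    # hypothetical echo response
echo = np.convolve(a_speech, echo_path)[:N]
received = b_speech + echo                     # what A hears over the line

recovered, w = nlms_echo_canceller(a_speech, received)
residual_echo = recovered - b_speech           # echo left after cancellation
```

After convergence the leading filter weights approximate the echo path, and the error signal `recovered` is close to B’s speech.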

Another application of such a frequency sensitive CRO is in generating sine
waves from a square wave. Square waves are easily generated using a switch in a
series circuit either turned on and off after fixed durations of time manually or by
a rotating motor. A square wave contains all the harmonics of a fundamental and
hence a CRO which passes the first N harmonics with variable N can be used
to demonstrate the Gibbs phenomenon on the behaviour of the sequence of partial
sums of a Fourier series near a discontinuity of the signal. Specifically, we can
demonstrate that near the discontinuity point of a square wave, the peak of the
partial sums of the Fourier series converges to around 1.18 times its amplitude.
A CRO with variable bandwidth can in fact be used to demonstrate all kinds of
behaviour of the partial sums of Fourier series including the fact that at a
discontinuity point of the signal, the partial sums converge to the average of the
signal values at the immediate left and at the immediate right of the discontinuity.
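The behaviour of the partial sums near a discontinuity can be illustrated numerically. The sketch below assumes a symmetric square wave of amplitude 1 and period $2\pi$, whose Fourier series is $(4/\pi)\sum_{k\ \mathrm{odd}} \sin(kt)/k$; the function name `square_wave_partial_sum` is introduced here for illustration.

```python
import numpy as np


def square_wave_partial_sum(t, n_harmonics):
    """Sum of the first n_harmonics odd harmonics of a +/-1 square wave."""
    s = np.zeros_like(t)
    for m in range(n_harmonics):
        k = 2 * m + 1
        s += (4.0 / np.pi) * np.sin(k * t) / k
    return s


# Peak of the partial sum just to the right of the jump at t = 0.
t = np.linspace(0.0, np.pi / 2, 20001)
peaks = {n: square_wave_partial_sum(t, n).max() for n in (10, 50, 200)}

# Value of the partial sum exactly at the jump.
at_jump = square_wave_partial_sum(np.array([0.0]), 200)[0]
```

The peak does not decay as more harmonics are passed; it only moves closer to the jump, while the value at the jump itself stays at the average of the left and right limits, which is zero here.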

A CRO at the quantum scale level can also be used to determine the energy
levels of an atomic system. Specifically, if $|n\rangle, n = 0,1,2,...$ are the energy
eigenstates of an atom with corresponding energy levels $E_n, n = 0,1,2,...$
respectively, then it is known from Schrödinger’s equation that if the initial state
of the atom is the superposition $|\psi(0)\rangle = \sum_n c(n)|n\rangle$, then after time $t$, its
state will be $|\psi(t)\rangle = \sum_n c(n)\exp(-i\omega(n)t)|n\rangle$ where $\omega(n) = 2\pi E_n/h$, $h$
being Planck’s constant. Thus, if $X$ is an observable of the atomic system, after
time $t$, its average value in this state will be given by

$$\langle X\rangle(t) = \langle\psi(t)|X|\psi(t)\rangle = \sum_{n,m} \bar{c}(n)c(m)\langle n|X|m\rangle\exp(i(\omega(n) - \omega(m))t)$$


When this signal is input into the quantum/nano CRO of the kind described
above, it retains only a small finite subset of the frequencies $\omega(n) - \omega(m)$ and
by adjusting the bandwidth of the CRO, we can thus determine from spectral
analysis of the signal appearing on the CRO, what exactly are the energy levels
of the atom or more precisely, what frequencies of radiation can be absorbed
or emitted by the atom during transitions caused by perturbing the atom with
an external radiation field. Another situation in quantum mechanics that finds
application here is related to the notion of repeated measurement and state
collapse. Suppose a quantum system has initial state $\rho(0)$. It evolves for time $t_1$
under the Hamiltonian $H$ to the state $\rho(t_1-) = U(t_1)\rho(0)U(t_1)^*$ where $U(t) =
\exp(-itH)$. Then a measurement is made using the POVM $M = \{M_a : a =
1,2,...,N\}$. If $a(1)$ is the noted outcome, after taking this measurement, the
state collapses to

$$\rho(t_1+) = \sqrt{M_{a(1)}}\,\rho(t_1-)\sqrt{M_{a(1)}}/\mathrm{Tr}(\rho(t_1-)M_{a(1)})$$
Again the system evolves for a duration $t_2 - t_1$ to the state
$$\rho(t_2-) = U(t_2 - t_1)\rho(t_1+)U(t_2 - t_1)^*$$

and again the measurement $M$ is made and if the noted outcome is $a(2)$, the
state collapses to

$$\rho(t_2+) = \sqrt{M_{a(2)}}\,\rho(t_2-)\sqrt{M_{a(2)}}/\mathrm{Tr}(\rho(t_2-)M_{a(2)})$$


It is then clear that after N such operations, namely, free evolution under the
Hamiltonian $H$ followed by applying the measurement $M$ and noting the
outcomes at each stage, the final state of the atom $\rho(t_N+)$ is expressible as a ratio
of two terms. The denominator is a real number and is of the form

$$\sum C(n_1, ..., n_N; m_1, ..., m_N)\exp\big(-i[(\omega(n_1) - \omega(m_1))t_1 + (\omega(n_2) - \omega(m_2))(t_2 - t_1) + ... + (\omega(n_N) - \omega(m_N))(t_N - t_{N-1})]\big)$$

where the $C(n_1, ..., n_N; m_1, ..., m_N)$ are complex numbers while the numerator is of
the same form but with the $C(n_1, ..., n_N; m_1, ..., m_N)$ being operators. In other words,
the numerator and denominator are multidimensional sinusoids in the variables
$(t_1, ..., t_N)$ and if we have a generalized spectrum analyzer for multidimensional
signals, then we could determine by measuring after time $t_N$, the average of an
observable in the system state, the frequencies $\omega(n)$, i.e., the energy levels of the
atomic system Hamiltonian $H$ as well as the initial state $\rho(0)$ and the structure of
the measurement operators $M_a$. In fact, more can be said about this problem. The
joint probability of getting $a(1), ..., a(N)$ during the above measurement process
at times $t_1, ..., t_N$ is given simply by

$$P(a(1), ..., a(N)|t_1, ..., t_N) = \mathrm{Tr}\big(E(a(N))U(t_N - t_{N-1}) \cdots E(a(2))U(t_2 - t_1)E(a(1))U(t_1)\rho(0)U(t_1)^*E(a(1))U(t_2 - t_1)^*E(a(2)) \cdots U(t_N - t_{N-1})^*E(a(N))\big)$$

where
$$E(a) = \sqrt{M_a}$$
This joint probability is clearly a superposition of multidimensional sinusoids
with frequency-tuples $(\omega(n_1) - \omega(m_1), \omega(n_2) - \omega(m_2), ..., \omega(n_N) - \omega(m_N))$ with complex
amplitudes and a harmonic analyzer of multidimensional sinusoids will be able
to determine these frequencies and hence estimate the atomic energy levels.


[14] Summary of the research carried out at the NSUT on design of DSP 
filters using transmission line elements, design of water antennas, design of an- 
tennas based on microstrip cavities of arbitrary cross sectional shape, design 
of antennas using microstrip cavities filled with material having inhomogeneous 
permittivity and permeability and design of fractional order filters using trans- 
mission line elements. 


Acknowledgements: Informal discussions with Dr. Mridul Gupta.

Introduction: A lossless transmission line has the natural property of being
able to generate transfer functions with arbitrary fractional delay. The reason
for this is that if the input forward and backward voltage amplitudes to the line
are $V_{1+}$ and $V_{1-}$ and the corresponding output forward and backward
amplitudes are $V_{2+}, V_{2-}$, then if $d$ is the line length, and $\beta = \omega/u, u = 1/\sqrt{LC}$,
elementary transmission line analysis shows that

$$V_{2+} = V_{1+}\exp(-j\beta d), \quad V_{2-} = V_{1-}\exp(j\beta d)$$

or equivalently, if $R_0$ is the characteristic line impedance, then the voltage and
current at the input and output terminals are related by using

$$V_1 = (V_{1+} + V_{1-}), \quad I_1 = (V_{1+} - V_{1-})/R_0,$$

$$V_2 = (V_{2+} + V_{2-}), \quad I_2 = (V_{2+} - V_{2-})/R_0$$

Working in the wave domain, i.e., in terms of forward and backward voltage wave
components, we have that

$$\begin{pmatrix} V_{2+} \\ V_{2-} \end{pmatrix} = \begin{pmatrix} \exp(-j\beta d) & 0 \\ 0 & \exp(j\beta d) \end{pmatrix}\begin{pmatrix} V_{1+} \\ V_{1-} \end{pmatrix}$$

Since $\beta = \omega/u$, the factor $\exp(-j\beta d)$ corresponds to a delay by $d/u$ seconds
and hence if time is discretized into steps of $\Delta$ seconds, the factor $\exp(-j\beta d)$
produces a delay of $d/(u\Delta)$ samples which need not be an integer if $d$ is chosen
appropriately. Thus it becomes evident that by connecting in tandem several
such Tx line units in conjunction with stub loads, lines having any transfer
function with numerator and denominator being superpositions of arbitrary
fractional powers of the unit delay $Z^{-1}$ can be synthesized. Specifically, consider
connecting in parallel to this line of length $d$, an impedance $Z_1$ at the input
end. Then, the voltage at the input end remains unchanged for fixed voltage
and current at the output end while the input current gets modified to $I_1' = I_1 +
V_1/Z_1$. We then find that for fixed $V_{2+}, V_{2-}$, the amplitudes $V_{1+}, V_{1-}$ get modified
respectively to
$$V_{1+}' = (V_1 + I_1'R_0)/2 = (V_1 + (I_1 + V_1/Z_1)R_0)/2$$

and

$$V_{1-}' = (V_1 - I_1'R_0)/2 = (V_1 - (I_1 + V_1/Z_1)R_0)/2$$
or equivalently
$$V_{1+}' = (V_{1+} + V_{1-})(1 + R_0/Z_1)/2 + (V_{1+} - V_{1-})/2$$

$$= (1 + R_0/2Z_1)V_{1+} + (R_0/2Z_1)V_{1-}$$

and
$$V_{1-}' = (V_{1+} + V_{1-})(1 - R_0/Z_1)/2 - (V_{1+} - V_{1-})/2$$

$$= -R_0V_{1+}/2Z_1 + V_{1-}(1 - R_0/2Z_1)$$
This means that connecting a load in parallel at the input end amounts to
multiplying the S-matrix given above by the matrix

$$\begin{pmatrix} 1 + R_0/2Z_1 & R_0/2Z_1 \\ -R_0/2Z_1 & 1 - R_0/2Z_1 \end{pmatrix}^{-1}$$

to its right. Since $Z_1$ can be any function of $j\omega$, this suggests that arbitrary
transfer functions with fractional delay elements can be realized. When we have
a sequence of $2 \times 2$ matrix transfer functions say $T_k(z), k = 1,2,...,N$ connected
in tandem, the overall transfer matrix is given by

$$S_N(z) = T_1(z)T_2(z)...T_N(z)$$

or equivalently, in recursive form,

$$S_{N+1}(z)(1,1) = S_N(z)(1,1)T_{N+1}(z)(1,1) + S_N(z)(1,2)T_{N+1}(z)(2,1),$$
$$S_{N+1}(z)(1,2) = S_N(z)(1,1)T_{N+1}(z)(1,2) + S_N(z)(1,2)T_{N+1}(z)(2,2),$$
$$S_{N+1}(z)(2,1) = S_N(z)(2,1)T_{N+1}(z)(1,1) + S_N(z)(2,2)T_{N+1}(z)(2,1),$$
$$S_{N+1}(z)(2,2) = S_N(z)(2,1)T_{N+1}(z)(1,2) + S_N(z)(2,2)T_{N+1}(z)(2,2)$$
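The tandem connection can be sketched numerically at a single frequency. The code below is an illustration under stated assumptions: the function names `line_section`, `shunt_load` and `cascade`, and all numerical values (propagation velocity, line length, sampling interval, load impedance), are hypothetical choices made here. It builds the wave-domain matrix $\mathrm{diag}(\exp(-j\beta d), \exp(j\beta d))$ of a line section, the matrix that a shunt impedance $Z_1$ applies to $(V_{1+}, V_{1-})$, and the running product $S_N = T_1T_2 \cdots T_N$.

```python
import numpy as np


def line_section(omega, d, u):
    """Wave-domain matrix of a lossless line of length d (beta = omega / u)."""
    phase = omega * d / u
    return np.array([[np.exp(-1j * phase), 0.0],
                     [0.0, np.exp(1j * phase)]])


def shunt_load(R0, Z1):
    """Matrix taking (V1+, V1-) to the modified amplitudes for a shunt Z1."""
    r = R0 / (2.0 * Z1)
    return np.array([[1.0 + r, r], [-r, 1.0 - r]], dtype=complex)


def cascade(matrices):
    """Overall matrix S_N = T_1 T_2 ... T_N, accumulated recursively."""
    S = np.eye(2, dtype=complex)
    for T in matrices:
        S = S @ T
    return S


# Hypothetical numbers: velocity 2e8 m/s, 1 ns sampling interval, 0.75 m line.
u, delta, d = 2.0e8, 1.0e-9, 0.75
delay_in_samples = d / (u * delta)          # a non-integer number of samples
omega = 2.0 * np.pi * 50.0e6

# A line section with a shunt load at its input end: multiply the line
# matrix on the right by the inverse of the load matrix.
T1 = line_section(omega, d, u) @ np.linalg.inv(shunt_load(50.0, 100.0 + 20.0j))
T2 = line_section(omega, 0.4, u)
S2 = cascade([T1, T2])
```

Sweeping `omega` over a frequency grid and recording an entry of the overall matrix gives the frequency response of the synthesized fractional-delay network.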


Sometimes, it may be easy to solve these difference equations. Usually, one
defines the scattering parameter $s_{21}(z)$ corresponding to a given transfer matrix
as $V_{2-}/V_{1+}|_{V_{2+}=0}$. This tells us how much amplitude is transferred from
the source to the load without reflection. The scattering matrix $S$ is defined via

$$\begin{pmatrix} V_{1-} \\ V_{2-} \end{pmatrix} = S\begin{pmatrix} V_{1+} \\ V_{2+} \end{pmatrix}$$

The elements of $S$ are related to those of the transfer matrix $T$ as follows:

$$V_{1-} = S(1,1)V_{1+} + S(1,2)V_{2+}, \quad V_{2-} = S(2,1)V_{1+} + S(2,2)V_{2+}$$

$$V_{2+} = T(1,1)V_{1+} + T(1,2)V_{1-}, \quad V_{2-} = T(2,1)V_{1+} + T(2,2)V_{1-}$$


[15] An example in quantum mechanics involving two-port parameters: Let
$\mathcal{H}_k, k = 1,2$ be two Hilbert spaces and let $\rho_k, k = 1,2$ be two density operators
defined in the Hilbert spaces $\mathcal{H}_k, k = 1,2$ respectively. Let $H_1, H_2$ be two
Hamiltonian operators defined in the Hilbert spaces $\mathcal{H}_1, \mathcal{H}_2$ respectively. Let
$V_1, V_2$ be two self-adjoint operators defined in $\mathcal{H}_1, \mathcal{H}_2$ respectively. Let $V_{12}$ be
a self-adjoint operator defined in $\mathcal{H}_1 \otimes \mathcal{H}_2$. Let $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ be the Hilbert
space of a quantum system having time varying Hamiltonian

$$H(t) = H_1 \otimes I_2 + I_1 \otimes H_2 + V_{12} + f_1(t)V_1 + f_2(t)V_2$$

$$= H_0 + f_1(t)V_1 + f_2(t)V_2$$
where
$$H_0 = H_1 \otimes I_2 + I_1 \otimes H_2 + V_{12}$$

is the unperturbed Hamiltonian. Note that the perturbing signals $f_1(t), f_2(t)$
are applied to the first and the second component Hilbert spaces
respectively, so that in $H(t)$ the operators $V_1, V_2$ stand for $V_1 \otimes I_2$ and
$I_1 \otimes V_2$. Then, let the initial state of this quantum system be

$$\rho(0) = \rho_1 \otimes \rho_2$$

After time $t$, the state of the system is

$$\rho(t) = U(t)\rho(0)U(t)^*$$

where

$$U(t) = T\{\exp(-i\int_0^t H(s)ds)\}$$

and if we take two observables $X_1, X_2$ respectively in the component Hilbert
spaces $\mathcal{H}_1, \mathcal{H}_2$, then the averages of these observables at time $t$ are respectively
given by

$$\langle X_1\rangle(t) = \mathrm{Tr}(\rho(t)(X_1 \otimes I_2))$$

and
$$\langle X_2\rangle(t) = \mathrm{Tr}(\rho(t)(I_1 \otimes X_2))$$

Problem: Compute these averages up to linear orders in $f_1(t), f_2(t)$ and explain
how this defines a two port system. Specifically, up to linear orders in $f_1, f_2$,
show that we can write

$$\langle X_1\rangle(t) = a_1(t) + \int h_{11}(t,\tau)f_1(\tau)d\tau + \int h_{12}(t,\tau)f_2(\tau)d\tau$$

$$\langle X_2\rangle(t) = a_2(t) + \int h_{21}(t,\tau)f_1(\tau)d\tau + \int h_{22}(t,\tau)f_2(\tau)d\tau$$
where the signals $a_1(t), a_2(t)$ are independent of $f_1(.)$ and $f_2(.)$.
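One way to approach this problem is first order time dependent perturbation theory in the interaction picture. The following is a sketch under that assumption; the symbols $U_0$, $\tilde{V}_k$ and $\tilde{X}_j$ are introduced here for the derivation, and $X_1, X_2, V_1, V_2$ are understood as operators on the tensor product space ($X_1 \otimes I_2$, $I_1 \otimes X_2$ and so on).

```latex
% Unperturbed evolution and interaction-picture operators
U_0(t) = \exp(-itH_0), \qquad
\tilde{V}_k(\tau) = U_0(\tau)^* V_k U_0(\tau), \qquad
\tilde{X}_j(t) = U_0(t)^* X_j U_0(t)

% First order Dyson series for the evolution operator
U(t) \approx U_0(t)\Big(I - i\int_0^t \big(f_1(\tau)\tilde{V}_1(\tau)
      + f_2(\tau)\tilde{V}_2(\tau)\big)\,d\tau\Big)

% Average of X_j up to linear order in f_1, f_2
\langle X_j \rangle(t) = \mathrm{Tr}\big(\rho(0)U(t)^* X_j U(t)\big)
  \approx \mathrm{Tr}\big(\rho(0)\tilde{X}_j(t)\big)
  + i\sum_{k=1}^{2}\int_0^t f_k(\tau)\,
    \mathrm{Tr}\big(\rho(0)[\tilde{V}_k(\tau), \tilde{X}_j(t)]\big)\,d\tau

% Identification of the two-port quantities
a_j(t) = \mathrm{Tr}\big(\rho(0)\tilde{X}_j(t)\big), \qquad
h_{jk}(t,\tau) = i\,\mathrm{Tr}\big(\rho(0)[\tilde{V}_k(\tau), \tilde{X}_j(t)]\big),
\quad 0 \le \tau \le t
```

In this reading, the kernels $h_{12}$ and $h_{21}$ describe how a signal applied at one port appears in the response at the other. They are nonzero only because of the coupling $V_{12}$ in $H_0$; without it, $\tilde{V}_k(\tau)$ and $\tilde{X}_j(t)$ act on different tensor factors for $j \ne k$ and commute.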


[16] Detecting whether supersymmetry is broken or not using a 
deep neural network 
For an Abelian gauge-superfield, a gauge invariant action has the form 


$$L = c_1f_{\mu\nu}f^{\mu\nu} + c_2\bar{\lambda}\gamma^\mu\partial_\mu\lambda + c_3D^2$$

This gauge action (which is always Lorentz and gauge invariant) is supersymmetry
invariant only for vectors $[c_1, c_2, c_3]$ belonging to a one dimensional subspace
of $R^3$. The problem of determining whether supersymmetry is broken or not thus
amounts to estimating these constants from measurements. Suppose we allow
the gauge particles and their superpartners to interact with a Dirac electron.
Then we would have to add the matter action to this gauge action and hence
determine the effect of this gauge perturbation upon the matter action and write
down the dynamics of the Dirac electron taking into account these interactions.
From the transition probabilities of the Dirac electron under these gauge
perturbations, we can estimate the constants $c_1, c_2, c_3$ and hence determine whether
the gauge action respects supersymmetry or not.

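In the non-Abelian gauge situation, the complete supersymmetric Lagrangian
density for the matter and gauge fields is given by (Reference: S. Weinberg, "The
quantum theory of fields", vol. III)

$$L = -(D_\mu\phi)_n^*(D^\mu\phi)_n - (1/2)\bar{\psi}_n\gamma^\mu(D_\mu\psi)_n + F_n^*F_n$$
$$- \mathrm{Re}((f''(\phi))_{nm}\psi_{nL}^T\epsilon\psi_{mL}) + 2\mathrm{Re}((f'(\phi))_nF_n) + c_1\mathrm{Im}(\bar{\psi}_{nL}\lambda_A(t_A)_{nm}\phi_m) - c_1\mathrm{Im}(\bar{\psi}_{nR}\lambda_A(t_A)_{nm}\phi_m^*)$$
$$- \phi_n^*(t_A)_{nm}\phi_mD_A - \xi_AD_A + (1/2)D_AD_A - (1/4)f_{A\mu\nu}f_A^{\mu\nu} - (1/2)\bar{\lambda}_A\gamma^\mu(D_\mu\lambda)_A + c_2\epsilon(\mu\nu\rho\sigma)f_A^{\mu\nu}f_A^{\rho\sigma}$$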


The field equations are obtained by setting the variational derivatives of $L$ w.r.t.
all the fields $\psi_{nL}, \psi_{nR}, \bar{\psi}_{nL}, \bar{\psi}_{nR}, \phi_n, \phi_n^*, F_n, F_n^*, V_{A\mu}, \lambda_A, D_A$ to zero. In
particular, we obtain a Dirac equation for $\psi_n$ with the operator $\partial_\mu$ replaced by the
gauge covariant derivative $D_\mu$ with the gauge field being $V_{A\mu}$ that define the
gauge field tensor $f_{A\mu\nu}$, and with masses dependent upon the scalar field and
sources determined by the coupling between the gaugino field $\lambda_A$ and the scalar
field $\phi_n$. The dynamical equations for the gaugino field $\lambda_A$ are again of the
$V_{A\mu}$-gauged Dirac form but with zero masses and with sources determined by
coupling terms between the scalar field and the Dirac field. Note that the gauge
field $V_{A\mu}$ and the scalar matter field $\phi_n$ are Bosonic while the Dirac field $\psi_n$
and the gaugino field $\lambda_A$ are Fermionic. The super-Dirac and gaugino equations
have the form

$$(i\gamma^\mu D_\mu - m(\phi))\psi_n = \chi_{1n}(\phi, \lambda_A),$$
$$i\gamma^\mu(D_\mu\lambda)_A = \chi_{2A}(\phi, \psi)$$

The super scalar field equation is now of the $V_{A\mu}$-gauged Klein-Gordon form
but with a source term determined by a coupling between the Dirac field and
the gaugino field. Specifically, it is of the form

$$D_\mu D^\mu\phi_n = \chi_{3n}(\psi, \lambda_A)$$

Deriving these field equations from the above supersymmetric Lagrangian
involves first eliminating the auxiliary fields $F_n, D_A$ by noting that their
variational equations yield ordinary linear algebraic equations for them in terms of
the scalar field. Finally, the field equations for $V_{A\mu}$ or equivalently, for

$$f_{A\mu\nu} = V_{A\nu,\mu} - V_{A\mu,\nu} + C(ABC)V_{B\mu}V_{C\nu}$$

are given by the usual Yang-Mills equations but with source current terms now
receiving their contribution from all the three fields: the scalar field, the Dirac
field and the gaugino field. These three current terms add up without
interference and contribute to the dynamics of the Yang-Mills gauge field.


[17] Review of a paper given by Mridul 
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This paper studies a certain kind of periodically forced nonlinear oscillator 
having a delay term in its dynamics as well as a noisy forcing term bilinearly 
coupled to its position. The delay terms as well as a white Gaussian noise term 
are parametrized by small parameters. The periodic forcing term is also bi- 
linearly coupled to the oscillator’s position process. The complete dynamics is 
described by a system of two first order differential equations in phase space. 
The analysis starts first with linearizing the dynamical system and obtaining the 
characteristic equation for the roots of the linearized system. This characteris- 
tic equation is a quadratic plus an exponential function, the exponential term 
arising due to the delay term. In short, the characteristic equation is a tran- 
scendental equation, not soluble in closed form. The authors apply the implicit 
function theorem to this characteristic equation obtaining thereby a formula for 
the sensitivity of the roots (the roots are also called the Lyapunov exponents 
as they give us the rate at which a small perturbation in the initial conditions 
diverges exponentially) w.r.t the perturbation parameter. After this brief analy- 
sis, the authors assume that the Lyapunov exponent is purely imaginary, ie, the 
linearized differential equation exhibits purely oscillatory behaviour, and derive 
closed form exact formulas for the oscillation frequency in terms of the pertur- 
bation parameter (6. After this, they assume that the process can be expressed 
as a slowly time varying amplitude modulating a sinusoid, just as in amplitude 
modulation theory (they have not stated this but it is implicit in equn.(18)), 
and assume that the delay affects only the sinusoidal carrier part and not the 
modulation amplitude. I think this portion of the authors' analysis requires a 
more detailed explanation. With this approximation, the authors are able to 
get rid of the delay term and thereby obtain a standard Ito-stochastic differen- 
tial equation without delay for the phase, ie, the position and velocity variables 
(equn.(22) and (26)). In (31), the authors define a stochastic Lyapunov 
exponent for the Ito sde solution (3). I think that this portion must also be explained 
more clearly in terms of the ergodic theorem for Brownian motion. Specifically, 
if $A(t)$ is the stochastic state transition matrix for a linear sde, then for the 
purpose of defining stochastic Lyapunov exponents, we must ensure conditions 
under which the limit of $t^{-1}\log(\lambda(t))$ exists as $t \to \infty$ where $\lambda(t)$ is an eigenvalue 
of $A(t)$. The authors then set up the usual Fokker-Planck equation for the pdf 
of the solution to Ito’s sde obtained by removing the delay term using the am- 
plitude modulation technique mentioned in my report above. They derive from 
this, the stationary/equilibrium solution (36) for the Fokker-Planck equation. 
I think at this point that some material on bifurcation theory applied to the 
transcendental characteristic equation can be added to explain how when the 
perturbation parameter that governs the strength of the delay term is increased 
gradually from zero, the Lyapunov exponents will change. This analysis can be 
carried out using standard perturbation theory for the roots of nonlinear equa- 
tions. The trouble, however, in using any sort of perturbation theory is to prove 
the convergence of the perturbation series. In (40), the authors write down a 
"Melnikov function". In order to make the text at this point self-contained, the 
authors may explain here how the Melnikov theory can be used to determine the 
driving frequency in a system of coupled sde's. Specifically, if there are periodic 
forcing terms in an sde, then the frequency of the forcing term can be extracted 
approximately in terms of an integral of some function of the stochastic process. 
I believe that this is the essence of the Melnikov theory. It would be nice to 
compute the mean and variance of the frequency estimate using this integral. 
Further, one can in principle derive maximum likelihood estimators of 
parameters of sde's using stochastic integral representations for the likelihood function. 
How does Melnikov's method compare with the optimum MLE method? The authors 
may shed some light on this problem. 

Although the noise is Gaussian, the system is nonlinear, and it follows 
that the system output will be non-Gaussian. Hence, if we move into higher 
orders of perturbation theory and not just the linearized theory, apart from 
the power spectrum, ie, the second order moments, higher order moments and 
spectra would become important. The authors may shed some light on this 
without getting into too much computation. 

In conclusion, I feel that the paper contains many new and interesting results 
and I recommend its publication after the issues mentioned above 
in my report are clarified. The main novelty of the paper appears 
to be to provide a first step towards describing chaos in a dynamical system in 
the presence of stochastic noise. 
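
The stochastic Lyapunov exponent discussed in the review can be illustrated on the simplest possible case. The sketch below is not taken from the paper under review: it assumes a scalar linear Ito sde $dX = aX\,dt + sX\,dB$ (the names `a`, `s` and `lyapunov_estimate` are hypothetical), for which the limit $t^{-1}\log|X(t)|$ is known to equal $a - s^2/2$ almost surely because $B(t)/t \to 0$.

```python
import numpy as np

def lyapunov_estimate(a, s, T=2000.0, n_steps=200_000, seed=0):
    """Estimate lim t^{-1} log|X(t)| for dX = a X dt + s X dB (assumed model).

    The exact solution is X(t) = X(0) exp((a - s^2/2) t + s B(t)), so the
    logarithm can be accumulated exactly from Brownian increments.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dB = rng.normal(0.0, np.sqrt(dt), size=n_steps)   # Brownian increments
    log_growth = np.sum((a - 0.5 * s**2) * dt + s * dB)
    return log_growth / T
```

For a matrix valued state transition matrix the same time average would be applied to the logarithm of each eigenvalue (or singular value), and the existence of the limit is exactly the point that needs an ergodic argument.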


[18] Application of the RLS lattice algorithm to the problem of 
estimating the metric from the world lines of particles in a gravitational 
field 

Let $g_{\mu\nu}(x|\theta)$ denote the metric of space-time which we intend to estimate. 
Here, $\theta$ is the unknown parameter vector which is to be estimated. We assume 
that 

$$g_{\mu\nu}(x|\theta) = h_{\mu\nu}(x) + \sum_{k=1}^{p} \theta[k]\psi_{\mu\nu k}(x)$$

where $\psi_{\mu\nu k}(x)$ are known test functions, $h_{\mu\nu}(x)$ is the unperturbed metric and 
the unknown parameters $\theta$ are small. The geodesic equations are 


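$$d^2x^r/d\tau^2 + \Gamma^r_{km}(x)(dx^k/d\tau)(dx^m/d\tau) + 2\Gamma^r_{k0}(x)(dx^k/d\tau)(dt/d\tau) + \Gamma^r_{00}(x)(dt/d\tau)^2 = 0$$
and 
$$d^2t/d\tau^2 + \Gamma^0_{km}(x)(dx^k/d\tau)(dx^m/d\tau) + 2\Gamma^0_{k0}(x)(dx^k/d\tau)(dt/d\tau) + \Gamma^0_{00}(x)(dt/d\tau)^2 = 0$$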


Now, we write down the geodesic equations using perturbation theory up to first 
order in the parameters $\theta$. They come out to be of the form 

$$d^2x^r/dt^2 = F^r(dx^s/dt, x^s) + \sum_{k=1}^{p} G^r_k(dx^s/dt, x^s)\theta[k]$$
or equivalently in vector-matrix notation as 

$$d^2r/dt^2 = f(dr/dt, r, t) + G(dr/dt, r, t)\theta$$
We can further discretize this equation and express it as a nonlinear difference 
equation: 
$$r[n+2] = \alpha(r[n+1], r[n], n) + \beta(r[n+1], r[n], n)\theta$$

This is a second order vector nonlinear time series model and the parameter 
vector $\theta$ can be estimated via the RLS lattice algorithm based on measurements 
$r[n], n = 0, 1, \ldots$. Note that the order of this time series model is $p$ which is also 
the number of test function matrices $\psi_{\mu\nu k}(x), k = 1, 2, \ldots, p$ that we use. We 
define the following vector and matrix valued functions of time 

$$\xi[N] = ((r[n+2] - \alpha(r[n+1], r[n], n)))_{n=0}^{N},$$
$$X[N, p] = [b_1[N], b_2[N], \ldots, b_p[N]]$$

where 
$$b_k[N] = ((\beta_k(r[n+1], r[n], n)))_{n=0}^{N}$$
and $\beta_k$ is the $k$th column of $\beta$. 

Then the above LIP (linear in parameters) model can be expressed in the form 

$$\xi[N] = X[N, p]\theta$$


and we can derive a recursive least squares with time and order updates of the 
parameter estimates. 
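
As a rough illustration of the time-update half of such a scheme, the following sketch runs the ordinary recursive least squares recursion on a generic linear-in-parameters model (data = regressor matrix times parameter vector). It is a plain RLS, not the lattice form with order updates, and the names `rls` and `delta` are hypothetical choices made only for this example.

```python
import numpy as np

def rls(X, e, delta=1e6):
    """Plain RLS time update for the model e[n] = X[n, :] @ theta.

    X     : (N, p) array of regressor rows (assumed known)
    e     : (N,) array of observations
    delta : large initial value of the inverse-correlation matrix (assumption)
    """
    n_rows, p = X.shape
    theta = np.zeros(p)
    P = delta * np.eye(p)                  # running estimate of (X^T X)^{-1}
    for x, y in zip(X, e):
        Px = P @ x
        gain = Px / (1.0 + x @ Px)         # Kalman-type gain vector
        theta = theta + gain * (y - x @ theta)
        P = P - np.outer(gain, Px)         # rank-one downdate of P
    return theta
```

Each new measurement costs $O(p^2)$ operations here; the lattice form trades this for order-recursive updates so that the model order $p$ can be changed without restarting.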

[19] A remark about the supersymmetric proof of the Atiyah-Singer-Patodi 
theorem. 

In general relativity, the covariant derivative of a vector field is calculated 
using the Christoffel symbols as connection components while on the other hand, 
in non-Abelian quantum field theory, the covariant derivative is calculated in a 
Lie algebraic manner, ie, as the commutator between the connection covariant 
derivative and the vector field expressed as an element of the Lie algebra. This 
applies also to quantum gravity wherein the non-Abelian connection covariant 
derivative is $\nabla_\mu = \partial_\mu + \Gamma_\mu$ where $\Gamma_\mu$, the spinor connection of the gravitational 
field, is expressed in terms of the Clifford algebra generated by the Dirac gamma 
matrices and a vector field $B_\mu$ is also expressed as a spinor field $B_\mu\gamma^\mu = B_\mu e^\mu_a\gamma^a$. 
The commutator of these two objects then defines the covariant derivative of 
the vector field. In a normal coordinate system, we wish to prove that near the 
origin, the latter definition of the covariant derivative as a commutator is the 
natural one to choose from when one computes the Laplacian as the square of the 
Dirac operator and this leads immediately to an expression for this Laplacian 
in terms of the Riemann curvature tensor from which the Atiyah-Singer index 
theorem can be obtained. 


The Dirac operator is 
Where 


Where 




Thus, 


Since 
$$\{\gamma^\mu(x), \gamma^\nu(x)\} = 2g^{\mu\nu}(x)$$

$$g^{\mu\nu}(x) = \eta^{ab}e^\mu_a(x)e^\nu_b(x)$$


The spinor-gravitational connection is 


Tn (x) = ly, eX eb 


The Dirac-spinor gravitational covariant derivative is 


$$\nabla_\mu = \partial_\mu + \Gamma_\mu$$


We get for a covariant vector By, 


Where 


Thus, 


Now, 


[Vu Bay” = [Vier Baesy”| 
= [Ves Bay") = [Ou a Du, Bay"| 
—= Bay - [Pas y"|Ba 


[Byes V7] = [yy wyeds ¥" 


V 
Wed = Ce Edv:u 
v V 

= ~€e.,€dv = —€ev:p€q = ~Wude 


ac, d 


ede = 2n ¥ 


[y°14, 1°] = 2n 


C..y] = Wea y° Fromwhichwededucethat 


[Vu Bl = [Vv ay | = 
Bai Abe Wea YBa 
= 7° (Bay + Wad” Be) 


v v v 
2Wyrab a 2€ 4 Cbv:n = €4€bu:ipm — €b Cavip = 


aan 7 Toth.) 2 CF (Can,u a Placa) 


Upto O(z), it is clear that 


tea Vv 
Caebv,u — €b€av,p = 9 


Since €qy,, is O(a) and further, e% is 6% upto O(x). Thus, upto O(z), we have 


that 


a be Vv 
Q2Wyab = Ly lean, = Cals) 


Ts Vabu _ Doau 


Advanced Probability and Statistics: Remarks and Problems 167 


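Since $e_{a\mu} = \eta_{a\mu} + O(x)$. Note that $\Gamma_{ab\mu}$ is $O(x)$ and hence 
$$\Gamma_{ab\mu}(x) - \Gamma_{ba\mu}(x) = (\Gamma_{ab\mu,m}(0) - \Gamma_{ba\mu,m}(0))x^m + O(x^2)$$
Now, 
$$\Gamma_{ab\mu,m} - \Gamma_{ba\mu,m} = (1/2)(g_{ab,\mu m} + g_{a\mu,bm} - g_{b\mu,am} - g_{ba,\mu m} - g_{b\mu,am} + g_{\mu a,bm})$$
$$= (1/2)(g_{a\mu,bm} - g_{b\mu,am} + g_{\mu a,bm} - g_{\mu b,am})$$
Thus, 
$$2\omega_{\mu ab}(x) = (1/2)(g_{a\mu,bm} - g_{b\mu,am} + g_{\mu a,bm} - g_{\mu b,am})(0)x^m$$
$$= (g_{a\mu,bm}(0) - g_{b\mu,am}(0))x^m$$

This is antisymmetric in $(a, b)$. We wish to show that $(g_{a\mu,bm} - g_{b\mu,am} + g_{\mu a,bm} - g_{\mu b,am})(0)$, 
for normal coordinates, is also antisymmetric in $(\mu, m)$. In fact, for 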


normal coordinates, we have 


$$g_{\mu\nu}(x)x^\nu = x^\mu$$

and hence on differentiating, 
$$g_{\mu\nu,\rho}x^\nu + g_{\mu\rho} = \delta_{\mu\rho}$$
Another differentiation gives 
$$g_{\mu\sigma,\rho} + g_{\mu\nu,\rho\sigma}x^\nu + g_{\mu\rho,\sigma} = 0$$
Another differentiation but now at the point $x = 0$ gives 
$$g_{\mu\sigma,\rho\beta}(0) + g_{\mu\beta,\rho\sigma}(0) + g_{\mu\rho,\sigma\beta}(0) = 0$$

Thus 
$$g_{a\mu,bm}(0) = -(g_{\mu m,ab}(0) + g_{\mu b,am}(0))$$
and to prove that 
$$g_{a\mu,bm}(0) - g_{b\mu,am}(0)$$

is also antisymmetric in $(\mu, m)$, we must show that 

$$g_{a\mu,bm}(0) - g_{b\mu,am}(0) = -g_{am,b\mu}(0) + g_{bm,a\mu}(0)$$

or equivalently that 

$$g_{a\mu,bm} + g_{am,b\mu} - g_{b\mu,am} - g_{bm,a\mu} = 0$$

In view of the above cyclic identity, this amounts to showing that 

$$-(g_{ab,m\mu} + g_{am,\mu b}) + g_{am,b\mu} - g_{b\mu,am} - g_{bm,a\mu} = 0$$
or equivalently, that 

$$-(g_{ab,m\mu} + g_{b\mu,am} + g_{bm,a\mu}) = 0$$


which is true again by virtue of the cyclic identity. Thus, in normal coordinates, 
we have that 

Wyab(Z) = Ryuvar(0)x” + O(2”) 
This is the fundamental relation that we are looking for. 


[20] Problems in optimal control of quantum fields 

Historical development of quantum filtering and control: Quantum filtering 
and control algorithms in the continuous time case were first introduced by 
V.P.Belavkin and perfected by John Gough and Luc Bouten. The notion of 
non-demolition measurements which do not interfere with future state values 
and which form an Abelian algebra of operators so that one can define in any 
state, the conditional expectation of the state at time t given measurements up to 
time t was the creation of Belavkin. He based all his computations on the 
Hudson-Parthasarathy model for the noisy Schrodinger equation and today this 
approach is the standard recognized way to describe filtering in the quantum 
context. The main obstacle in quantum filtering was that non-commutativity of 
observables prevented one from constructing conditional expectations and that 
was completely and satisfactorily resolved by Belavkin by the introduction of 
a special class of measurements, namely non-demolition measurements which 
actually can be shown to be the correct quantum analogue of the measurement 
process in the classical filtering theory of Kallianpur and Striebel, namely the 
measurement differential at time t equals the sum of a function of the current 
system state plus white measurement noise. Luc Bouten in his PhD thesis showed 
how one can introduce a control term into the Belavkin quantum filter so as to 
minimize the effect of Lindblad noise in some channels. By Lindblad noise, we 
mean the noise terms appearing in the unitary evolution of system plus bath 
in the Hudson-Parthasarathy noisy Schrodinger equation. Later on, other 
researchers including Belavkin, John Gough and Luc Bouten introduced quantum 
control in the form of changing the Hamiltonian in the Hudson-Parthasarathy- 
Schrodinger evolution equation in accordance with a desired output and the 
actual non-demolition measurement at time t. For example, in the case of a 
quantum robot, we can alter the Hamiltonian by a torque times angular dis- 
placement term where the torque is proportional to the difference between a 
desired angular momentum and the actual noisy measured angular momentum. 
Feedback controllers must always be in the form of forces/torques proportional 
to the difference between a desired output and the actual output, or some 
differential or integral of such a difference, or more generally to a linear combination 
of such terms (like the p.i.d controller in classical control theory). It should be 
noted that in the quantum context, the parameter in the Hamiltonian being 
controlled becomes a function of the non-demolition measurement operator. 


[1] Let $\rho_s(t)$, the state of a system, evolve according to the GKSL equation 

$$\rho_s'(t) = -i[H(u(t)), \rho_s(t)] - (1/2)\theta(\rho_s(t), u(t))$$
where $u(t)$ is the coherent state parameter and $\theta$ is the Lindblad term with 
the Lindblad operators $L_j$ dependent upon the coherent state parameter $u(t)$. 
Specifically, we are assuming that the coherent state $|\phi(u)\rangle$ of the bath slowly 
varies with time and we write down the Hudson-Parthasarathy qsde for the 
evolution and then by tracing out over this non-vacuum bath coherent state, 
obtain the dynamics of the system state: 

$$dU(t) = (-(iH + LL^*/2)dt + L\,dA(t) - L^*dA(t)^*)U(t)$$

$$\rho(t) = U(t)(\rho_s(0) \otimes |\phi(u)\rangle\langle\phi(u)|)U(t)^*$$

We have that 

$$d\rho(t) = dU(t)\rho(0)U(t)^* + U(t)\rho(0)dU(t)^* + dU(t)\rho(0)dU(t)^*,$$

$$\rho(0) = \rho_s(0) \otimes |\phi(u)\rangle\langle\phi(u)|$$

Then, 
$$Tr_2(dU(t)\rho(0)U(t)^*) = Tr_2((Q\,dt + L\,dA(t) - L^*dA(t)^*)U(t)\rho(0)U(t)^*)$$
$$= dt[Q\rho_s(t) + u(t)L\rho_s(t) - \bar{u}(t)L^*\rho_s(t)]$$

where 

$$Q = -(iH + LL^*/2)$$
Likewise, 

$$Tr_2(U(t)\rho(0)dU(t)^*) = Tr_2(U(t)\rho(0)U(t)^*(Q^*dt + L^*dA(t)^* - L\,dA(t)))$$

$$= [\rho_s(t)Q^* + \bar{u}(t)\rho_s(t)L^* - u(t)\rho_s(t)L]dt$$
and finally, 
$$Tr_2(dU(t)\rho(0)dU(t)^*) = Tr_2[L^*dA(t)^*U(t)\rho(0)U(t)^*L\,dA(t)]$$

$$= L^*\rho_s(t)L\,dt$$
We thus get 
$$\rho_s'(t) = [Q\rho_s(t) + u(t)L\rho_s(t) - \bar{u}(t)L^*\rho_s(t)]$$
$$+ [\rho_s(t)Q^* + \bar{u}(t)\rho_s(t)L^* - u(t)\rho_s(t)L] + L^*\rho_s(t)L$$

which is precisely the GKSL equation in a non-vacuum coherent state $|\phi(u)\rangle$. 
We can represent it as 

$$\rho_s'(t) = T(\rho_s(t), u(t))$$


where $T(\cdot, u)$ is a linear operator on the Banach space of bounded operators in 
the system Hilbert space parameterized by the complex parameter $u$. For state 
tracking, we require to control $u(t)$ so that $\rho_s(t)$ tracks a given state $\rho_d(t)$. This 
may be termed as coherent quantum control. We can also incorporate 
conservation processes into the GKSL dynamics: The relevant Hudson-Parthasarathy 
noisy Schrodinger equation is 

$$dU(t) = (Q\,dt + L_1dA(t) - L_2dA(t)^* + L_3d\Lambda(t))U(t)$$
where $Q = -(iH + P)$. We then find that 

$$Tr_2(dU(t)\rho(0)U(t)^*) = [Q\rho_s(t) + u(t)L_1\rho_s(t) - \bar{u}(t)L_2\rho_s(t) + |u(t)|^2L_3\rho_s(t)]dt$$
$$Tr_2(U(t)\rho(0)dU(t)^*) = [\rho_s(t)Q^* + \bar{u}(t)\rho_s(t)L_1^* - u(t)\rho_s(t)L_2^* + |u(t)|^2\rho_s(t)L_3^*]dt$$
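
To get a feel for a GKSL equation driven by a coherent-state amplitude, the sketch below integrates a master equation numerically with a fourth order Runge-Kutta step. It is written in the commonly used convention $\rho' = -i[H,\rho] + [\bar{u}L - uL^*, \rho] + L\rho L^* - \frac{1}{2}(L^*L\rho + \rho L^*L)$, which is an assumption and may differ in adjoint and sign placement from the convention of the derivation above; the function names `gksl_rhs` and `evolve` are hypothetical.

```python
import numpy as np

def gksl_rhs(rho, H, L, u):
    """Right side of a GKSL equation with a coherent drive u (assumed convention)."""
    Ld = L.conj().T
    drive = np.conj(u) * L - u * Ld                   # anti-Hermitian drive term
    dissipator = L @ rho @ Ld - 0.5 * (Ld @ L @ rho + rho @ Ld @ L)
    return (-1j * (H @ rho - rho @ H)
            + (drive @ rho - rho @ drive)
            + dissipator)

def evolve(rho0, H, L, u_of_t, T=5.0, n_steps=5000):
    """Integrate the master equation from 0 to T with classical RK4 steps."""
    dt = T / n_steps
    rho = rho0.astype(complex)
    for n in range(n_steps):
        t = n * dt
        k1 = gksl_rhs(rho, H, L, u_of_t(t))
        k2 = gksl_rhs(rho + 0.5 * dt * k1, H, L, u_of_t(t + 0.5 * dt))
        k3 = gksl_rhs(rho + 0.5 * dt * k2, H, L, u_of_t(t + 0.5 * dt))
        k4 = gksl_rhs(rho + dt * k3, H, L, u_of_t(t + dt))
        rho = rho + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return rho
```

Every term on the right side is traceless and maps Hermitian matrices to Hermitian matrices, so the trace and Hermiticity of $\rho_s(t)$ are preserved by the integrator up to rounding. A tracking controller would choose `u_of_t` to reduce a distance between the computed state and the desired state $\rho_d(t)$.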


[21] pde based denoising of quantum image fields 

[1] Let the Lindblad operators $L_k$ be functions of $q, p$ where $q, p$ are canonical 
position and momentum operator valued n-dimensional vectors. Thus, $q$ is multiplication 
by the vector $x$ in $\mathbb{R}^n$ and $p$ is the gradient operator $-i\nabla_x$ in $\mathbb{R}^n$. The density 
operator in position space representation at time $t$ is defined by the kernel 
$\rho_t(x,y), x, y \in \mathbb{R}^n$. Thus, for $f \in L^2(\mathbb{R}^n)$, we have that 

$$\rho_t f(x) = \int \rho_t(x,y)f(y)dy$$

and 

$$\rho_t q f(x) = \int \rho_t(x,y)\,y\,f(y)dy,$$
$$q\rho_t f(x) = \int x\,\rho_t(x,y)f(y)dy,$$
$$\rho_t p f(x) = -i\int \rho_t(x,y)\nabla_y f(y)dy = i\int (\nabla_y\rho_t(x,y))f(y)dy,$$
$$p\rho_t f(x) = -i\int (\nabla_x\rho_t(x,y))f(y)dy$$

and more generally, 
$$q^a p^b \rho_t q^c p^d f(x) = (-i)^{|b|+|d|}(-1)^{|d|}x^a\int \nabla_x^b\nabla_y^d(\rho_t(x,y)y^c)f(y)dy$$

which is equivalent to saying that the kernel of the operator $q^a p^b \rho_t q^c p^d$ is given 
by 
$$K_t(x,y) = (-i)^{|b|+|d|}(-1)^{|d|}x^a\nabla_x^b\nabla_y^d(\rho_t(x,y)y^c)$$

In our notation, $x^a$ is an abbreviation for $x_1^{a_1} \ldots x_n^{a_n}$ and $p^b$ is an abbreviation 
for $p_1^{b_1} \ldots p_n^{b_n}$ or equivalently for 

$$(-i)^{b_1+\ldots+b_n}\frac{\partial^{b_1+\ldots+b_n}}{\partial x_1^{b_1} \ldots \partial x_n^{b_n}}$$
More generally, if an operator $L = L(q,p)$ is expressed as a function of $q, p$ with 
all the $q$'s coming to the left and all the $p$'s to the right, then $L(q,p)\rho_t$ is an 
operator having kernel 

$$L(x, -i\nabla_x)\rho_t(x,y)$$

while $\rho_t L(q,p)$ has the kernel 

$$M(i\nabla_y, y)\rho_t(x,y)$$

where $M(u,v)$ is obtained from $L(u,v)$ by replacing each term of the form $u^r v^s$ 
in the expansion of $L(u,v)$ by $v^s u^r$. More generally, the kernel of the operator 
$L_1(q,p)\rho_t L_2(q,p)$ is given by 
$$L_1(x, -i\nabla_x)M_2(i\nabla_y, y)\rho_t(x,y)$$

where if 

$$L_2(u,v) = \sum_{r,s} a(r,s)u^r v^s$$
then 

$$M_2(u,v) = \sum_{r,s} a(r,s)v^s u^r$$


An example of how to derive a Hamiltonian that reproduces the same effect 
as that of a partial differential operator acting on a quantum field. 
Let 

$$\psi_k(t,r) = \exp(-i(\omega(k)t - k.r))$$

and consider the quantum field 

$$X(t,r) = \sum_k (a(k)\psi_k(t,r) + a(k)^*\bar{\psi}_k(t,r))$$
where 
$$[a(k), a(q)] = 0, \quad [a(k), a(q)^*] = \delta_{kq}$$
Then 
$$i\partial_t\psi_k(t,r) = \omega(k)\psi_k(t,r),$$
$$-i\nabla_r\psi_k(t,r) = k\psi_k(t,r)$$
Hence, 

$$\omega(-i\nabla_r)\psi_k(t,r) = \omega(k)\psi_k(t,r)$$
so that $\psi_k$ satisfies the pde 

$$i\partial_t\psi_k(t,r) = \omega(-i\nabla_r)\psi_k(t,r)$$
Since we assume $\omega(k)$ to be a real valued function, we also get 

$$-i\partial_t\bar{\psi}_k(t,r) = \omega(k)\bar{\psi}_k(t,r),$$
$$\omega(i\nabla_r)\bar{\psi}_k(t,r) = \omega(k)\bar{\psi}_k(t,r)$$

and hence 
$$-\partial_t^2 X(t,r) = \omega(-i\nabla_r)^2 X(t,r)$$

provided that we assume that 
$$\omega(-k) = \omega(k)$$
so that defining the operator 2-vector field 
$$Z(t,r) = [X(t,r), i\partial_t X(t,r)]^T$$

we get that 
$$i\partial_t Z(t,r) = [i\partial_t X(t,r), \omega(-i\nabla_r)^2 X(t,r)]^T$$

$$= \begin{pmatrix} 0 & 1 \\ \omega(-i\nabla_r)^2 & 0 \end{pmatrix} Z(t,r)$$


This is an example of a quantum field satisfying a vector partial differential 
equation in space-time with the time derivative being only of the first order. 


Let $X(t,r)$ be a quantum image field built out of creation and annihilation 
operators. We wish to denoise this quantum image field. Let $X_0(t,r)$ denote 
the corresponding denoised quantum image field. We pass the noisy quantum 
image field $X(t,r)$ through a spatio-temporal linear filter having an impulse 
response $H(t,r)$. The output of this filter is given by 

$$\hat{X}_0(t,r) = \int H(t-t', r-r')X(t',r')dt'd^3r'$$

We wish to select the filter $H(t,r)$ so that $\hat{X}_0(t,r)$ is a close approximation to 
$X_0(t,r)$ in a given quantum state $\rho$. This means that we select the function 
$H(t,r)$ so that 

$$\int Tr(\rho(X_0(t,r) - \hat{X}_0(t,r))^2)dt\,d^3r$$

is minimal. Setting the variational derivative of this error energy function w.r.t. 
$H$ to zero then gives us the optimal normal equations 

$$Tr\left(\rho\int dt\,d^3r\,(X_0(t,r) - \hat{X}_0(t,r))X(t-t', r-r')\right) = 0$$
or equivalently, 

$$\int Tr(\rho\{X_0(t,r), X(t-t', r-r')\})dt\,d^3r$$

$$= \int ds\,d^3u\,H(s,u)\int Tr(\rho\{X(t-s, r-u), X(t-t', r-r')\})dt\,d^3r$$

Thus, to calculate the filter, we must evaluate the symmetrized quantum 
correlations 

$$Tr(\rho\{X_0(t,r), X(t_1,r_1)\}), \quad Tr(\rho\{X(t,r), X(t_1,r_1)\})$$
Assuming that $\rho$ is a quantum Gaussian state so that it is expressible as an 
exponential of a linear-quadratic form in the creation and annihilation operators 
$a(k), a(k)^*, k = 1, 2, \ldots$, we express 
the quantum fields $X_0, X$ as polynomial functionals in the $a(k), a(k)^*$, and 
computing the quantum correlations then amounts to calculating the multiple 
moments of the creation and annihilation operators in a Gaussian state, ie, 
evaluating moments of the form 

$$Tr\left(\rho\,\Pi_k(a(k)^{m_k})\,\Pi_k(a(k)^{*n_k})\right)$$
where 

$$\rho = C\exp\left(-\sum_k [\alpha(k)a(k) + \bar{\alpha}(k)a(k)^*] - \sum_{k,m}[\beta_1(k,m)a(k)a(m) + \beta_2(k,m)a(k)^*a(m)^*] - \sum_{k,m}\beta_3(k,m)a(k)^*a(m)\right)$$


The easiest way to evaluate these moments is to use the Glauber-Sudarshan 
resolution of the identity in terms of coherent states. 
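
As a numerical cross-check of such moment calculations, the sketch below evaluates normally ordered moments for the simplest special case, a single mode in the state $\rho = C\exp(-\beta a^*a)$. It uses a truncated Fock basis instead of the coherent state resolution of the identity mentioned above, which is a swapped-in technique chosen only because it is short; the function name `thermal_moments` and the truncation size `dim` are assumptions of this example.

```python
import numpy as np

def thermal_moments(beta, dim=200):
    """Moments of a single mode in rho = C exp(-beta a* a), truncated Fock space.

    Returns (Tr(rho a* a), Tr(rho a* a* a a)).
    """
    n = np.arange(dim)
    a = np.diag(np.sqrt(n[1:]), k=1)        # annihilation: a|n> = sqrt(n)|n-1>
    ad = a.conj().T                         # creation operator
    rho = np.diag(np.exp(-beta * n))
    rho = rho / np.trace(rho)               # the constant C normalises the trace
    second = np.trace(rho @ ad @ a).real
    fourth = np.trace(rho @ ad @ ad @ a @ a).real
    return second, fourth
```

For this state the second moment should equal $1/(e^\beta - 1)$ and the fourth moment should equal twice its square, which is the Gaussian (Wick) factorisation one expects for any quantum Gaussian state with zero mean.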


[22] Definition of the universal enveloping algebra of a Lie algebra: Let 
$g$ be a Lie algebra and let $(C, \pi)$ be a pair such that (a) $C$ is an associative 
algebra, (b) $\pi : g \to C$ is a linear mapping satisfying $\pi([X,Y]) = \pi(X)\pi(Y) - \pi(Y)\pi(X) \; \forall X, Y \in g$, 
(c) $\pi(g)$ generates $C$ and (d) if $U$ is any associative 
algebra and $\xi : g \to U$ is a linear map satisfying $\xi([X,Y]) = \xi(X)\xi(Y) - \xi(Y)\xi(X) \; \forall X, Y \in g$, 
then there exists an algebra homomorphism $\xi' : C \to U$ 
such that $\xi'(\pi(X)) = \xi(X) \; \forall X \in g$. Then, $(C, \pi)$ is called a universal enveloping 
algebra of $g$. Theorem: If $(\pi_k, C_k), k = 1, 2$ are two universal enveloping algebras 
of a Lie algebra $g$, then they are isomorphic in the sense that there exists an 
algebra isomorphism $\xi : C_1 \to C_2$ such that $\xi(\pi_1(X)) = \pi_2(X) \; \forall X \in g$. 


[23] Questions on statistical signal processing 
Attempt any five questions. 
[1] Let $X(n), n \in \mathbb{Z}$ be any stationary process with 

$$E(X(n)) = \mu, \quad Cov(X(n), X(m)) = C(n - m)$$

Prove that if 
$$C(n) \to 0, \quad |n| \to \infty$$

then 
$$\lim_{N\to\infty} N^{-1}\sum_{n=1}^{N} X(n) = \mu$$
in the mean square sense. 


[2] If $X(n)$ is a stationary Gaussian process with mean $\mu$ and covariance 
$C(n - m)$, then show that 

$$N^{-1}\sum_{n=1}^{N} X(n)X(n + k)$$
converges in the mean square sense to $C(k) + \mu^2$ as $N \to \infty$ provided that 
$C(n) \to 0$ as $|n| \to \infty$. 


[3] Prove the Cesaro theorem: If $a(n)$ is a sequence of complex numbers such 
that $a(n) \to c$ as $n \to \infty$, then $N^{-1}\sum_{n=1}^{N} a(n) \to c$ as $N \to \infty$. 


[4] Define a filtered probability space and a Martingale and a 
submartingale in discrete time w.r.t this filtered probability space. Show that if $X(n)$ is 
a submartingale, then the maximal inequality holds: 

$$Pr(\max_{0 \le n \le N}|X(n)| > \epsilon) \le E(|X(N)|)/\epsilon$$

Use this result to prove the following version of the strong law of large numbers: 
If $X(n)$ is a sequence of iid random variables with mean $\mu$ and variance $\sigma^2$, then 
almost surely, 

$$\lim_{N\to\infty} N^{-1}\sum_{n=1}^{N} X(n) = \mu$$


[5] Write short notes on the following: 

(a) The Borel-Cantelli Lemmas. 

(b) Application of the Borel-Cantelli lemmas to proving almost sure conver- 
gence of a sequence of random variables. 

(c) Doob’s optional sampling theorem for Martingales and bounded stop- 
times. 

(d) Asymptotic mean and variance of the periodogram of a stationary zero 
mean Gaussian random process 


[6] Define the Ito stochastic integral for an adapted process w.r.t. Brownian 
motion and prove the existence and uniqueness of solutions to a stochastic 
differential equation 

$$dX(t) = f(X(t))dt + g(X(t))dB(t)$$
interpreted as a stochastic integral equation 
$$X(t) - X(0) = \int_0^t f(X(s))ds + \int_0^t g(X(s))dB(s)$$
when $f, g$ satisfy the Lipschitz condition 
$$|f(X) - f(Y)| + |g(X) - g(Y)| \le K|X - Y| \quad \forall X, Y \in \mathbb{R}$$


[7] Prove using the Ito calculus that the process 

$$M_\lambda(t) = \exp(\lambda B(t) - \lambda^2 t/2)$$
is a Brownian martingale. Now apply Doob's optional sampling theorem to 
this martingale to determine the probability density of the first hitting time of 
Brownian motion $B(t)$ at a level $a > 0$ given $B(0) = 0$. 


[8] Let $x[n]$ be any process. Define the vectors 

$$x_n = [x[n], x[n-1], \ldots, x[0]]^T \in \mathbb{R}^{n+1}$$
and for $r = 1, 2, \ldots$, 
$$z^{-r}x_n = [x[n-r], x[n-r-1], \ldots, x[0], 0, \ldots, 0]^T \in \mathbb{R}^{n+1}$$
Let 
$$a_{n,p} = [a_{n,p}[1], \ldots, a_{n,p}[p]]^T$$

denote the optimum forward prediction filter of order $p$ at time $n$, ie, 
$$e_f[n|p] = x_n + \sum_{k=1}^{p} a_{n,p}[k]z^{-k}x_n = x_n + X_{n,p}a_{n,p}$$

where 
$$X_{n,p} = [z^{-1}x_n, z^{-2}x_n, \ldots, z^{-p}x_n] \in \mathbb{R}^{(n+1) \times p}$$

is such that 
$$E_f(n|p) = \| e_f[n|p] \|^2$$

is a minimum. Likewise, let 

$$b_{n-1,p} = [b_{n-1,p}[1], \ldots, b_{n-1,p}[p]]^T$$

be the optimum backward prediction filter of order $p$ at time $n - 1$, ie, 

$$e_b[n-1|p] = z^{-p-1}x_n + \sum_{k=1}^{p} b_{n-1,p}[k]z^{-k}x_n = z^{-p-1}x_n + X_{n,p}b_{n-1,p}$$
is such that 
$$E_b(n-1|p) = \| e_b[n-1|p] \|^2$$

is a minimum. Derive time and order recursive formulas for the following: 

$$e_f(n|p), \quad e_b(n-1|p), \quad a_{n,p}, \quad b_{n,p}$$

Also derive time and order projection operator update formulas for the 
orthogonal projection $P_{n,p}$ onto $R(X_{n,p})$. Now, if $y[n]$ is another process, consider the 
joint process predictor 

$$\hat{y}_{n,p} = \sum_{k=1}^{p} h_{n,p}[k]z^{-k}x_n$$
so that 
$$\| y_n - \hat{y}_{n,p} \|^2$$
is a minimum. Show that 
$$\hat{y}_{n,p} = P_{n,p}y_n$$
Show that we can write 
$$\hat{y}_{n,p} = \sum_{k=0}^{p-1} g_{n,p}[k]e_b[n-1|k]$$

where 
$$g_{n,p}[k] = \langle y_n, e_b[n-1|k] \rangle, \quad 0 \le k \le p - 1$$

Derive time and order update formulas for the coefficients $g_{n,p}[k]$. In the process 
of doing this, you must also derive time and order update formulas for the 
forward prediction error filter transfer function 


the backward prediction error filter transfer function 
P 
Bi(eyao-P ey baile 
k=1 


and the joint process predictor transfer function 


[9] Give all the steps for the construction of the $L^2$-Ito stochastic integral for 
an adapted process $f(t)$ w.r.t Brownian motion $B(t)$ over a finite time interval 
$[0, T]$. Also derive the fundamental properties of this integral, namely 

$$E\int_0^T f(t)dB(t) = 0, \quad E\left(\int_0^T f(t)dB(t)\right)^2 = \int_0^T E(f(t)^2)dt$$


[10] Derive the Levinson-Durbin algorithm for calculating the forward 
and backward predictors of a stationary process $x[n]$ with autocorrelation $R[n]$ 
up to order $p$. Show that per order iteration, you require only $O(p)$ 
multiplications as compared to $O(p^2)$ required if you were to calculate the order $p$ predictor 
directly by solving the relevant optimum normal equations. 
[11] Let $X[n]$ be a stationary Gaussian process. Show that the entropy rate 
$\bar{H}(X)$ of this process defined by 

$$\bar{H}(X) = \lim_{N\to\infty} N^{-1}H(X[N], X[N-1], \ldots, X[1])$$
exists and in fact equals 
$$H(X(0)|X(-1), X(-2), \ldots)$$

where if $U, V$ are random vectors, then 

$$H(U) = -\int f_U(u)\ln(f_U(u))du, \quad H(U|V) = -\int f_{UV}(u,v)\ln(f_{U|V}(u|v))du\,dv$$

Hence, prove that 

$$\bar{H}(X) = \frac{1}{2}\ln(2\pi e) + \frac{1}{4\pi}\int_0^{2\pi}\ln(S_X(\omega))d\omega$$

where $S_X(\omega)$ is the power spectral density of the process. 


[12] Apply Cramer's large deviation principle to compute the optimal 
asymptotic false alarm error probability rate, as the number of iid samples tends to $\infty$, 
as the Kullback-Leibler/information theoretic distance between the two 
probability distributions. 
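
For question [10] above, the order recursion can be sketched as follows for a scalar real process. This is a minimal illustration under the sign convention $x[n] + \sum_{k=1}^{p} a[k]x[n-k] = e[n]$, so that the normal equations read $R[k] + \sum_m a[m]R[k-m] = 0$ for $k = 1, \ldots, p$; the function name `levinson_durbin` is a choice made here and is not fixed by the text.

```python
import numpy as np

def levinson_durbin(R, p):
    """Order-recursive solution of the scalar normal equations.

    R : autocorrelation values R[0], ..., R[p] of a real stationary process
    Returns (a, E): a[0] = 1 followed by the order-p predictor coefficients,
    and E, the order-p prediction error energy.
    """
    R = np.asarray(R, dtype=float)
    a = np.zeros(p + 1)
    a[0] = 1.0
    E = R[0]
    for m in range(1, p + 1):
        # correlation of the order-(m-1) forward error with x[n-m]
        acc = R[m] + np.dot(a[1:m], R[m - 1:0:-1])
        k = -acc / E                          # reflection coefficient
        a_prev = a.copy()
        a[1:m] = a_prev[1:m] + k * a_prev[m - 1:0:-1]
        a[m] = k
        E *= (1.0 - k * k)                    # error energy update
    return a, E
```

Each pass through the loop costs $O(m)$ multiplications, which is the $O(p)$ per-order count asked for in the question. For a real scalar process the backward predictor is the forward predictor with its coefficients reversed, so it needs no separate recursion.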


[24] Questions on transmission lines and waveguides 

Attempt any five questions. 

[1] Calculate the capacitance per unit length between two parallel transmis- 
sion lines of cylindrical shape having radii a,b with a separation of d between 
their axes. Use the theory of functions of a complex variable to make this 
computation. 

Super-directivity of a quantum antenna: Consider the electron-positron 
field with wave operator field $\psi(t,r)$. The four current density is then given by 

$$J^\mu(x) = -e\psi(x)^*\alpha^\mu\psi(x), \quad \alpha^\mu = \gamma^0\gamma^\mu$$
and the four vector potential generated by this four current density is then 
given by 
$$A^\mu(x) = \int J^\mu(x')G(x - x')d^4x'$$
where 
$$G(x) = (\mu/4\pi)\delta(x^2) = (\mu/4\pi r)\delta(t - |r|/c)$$

is the causal Green's function for the wave operator. Now using the above 
formula for the vector potential, the far field four potential has the form 

$$A^\mu(t,r) = (\mu/4\pi r)\int J^\mu(t - r/c + \hat{r}.r'/c, r')d^3r'$$

It follows that as a function of frequency, the far field four potential has the 
angular amplitude pattern 

$$B^\mu(\omega, \hat{r}) = \int J^\mu(\omega, r')\exp(jk\hat{r}.r')d^3r'$$

To evaluate the directional properties of the corresponding power pattern, we 
must first choose a state $|\eta\rangle$ for the electron-positron system and compute 

$$S^{\mu\nu}(\omega, \hat{r}) = \langle\eta|B^\mu(\omega, \hat{r}).B^\nu(\omega, \hat{r})^*|\eta\rangle$$

$$= \int \langle\eta|J^\mu(\omega, r_1)J^\nu(\omega, r_2)^*|\eta\rangle\exp(jk\hat{r}.(r_1 - r_2))d^3r_1d^3r_2$$

In order to obtain superdirectional properties of the radiated field, we must 
prepare the state $|\eta\rangle$ so that the above quantum average is large when $\mu = \nu$. 
First observe that in terms of the creation and annihilation operators of the 
electron-positron field, the Dirac wave operator field is given by 

$$\psi(t,r) = \int [u(P,\sigma)a(P,\sigma)\exp(-i(E(P)t - P.r)) + v(P,\sigma)b(P,\sigma)^*\exp(i(E(P)t - P.r))]d^3P$$
where $E(P) = \sqrt{m^2 + P^2}$. We then find that the temporal Fourier transform 
of the four current density $J^\mu(t,r) = -e\psi(t,r)^*\alpha^\mu\psi(t,r)$ is given by the 
convolution 

$$J^\mu(\omega, r) = (-e/2\pi)\int_{\mathbb{R}} \psi(\omega' - \omega, r)^*\alpha^\mu\psi(\omega', r)d\omega'$$
where $\psi(\omega, r)$, the temporal Fourier transform of $\psi(t,r)$, is given by 

$$\psi(\omega, r) = (2\pi)^{-1}\int [u(P,\sigma)a(P,\sigma)\exp(iP.r)\delta(\omega - E(P)) + v(P,\sigma)b(P,\sigma)^*\exp(-iP.r)\delta(\omega + E(P))]d^3P$$


In our CDRA case, we have to modify this formula slightly. The possible 
frequencies of the Dirac field are not a continuum $E(P), P \in \mathbb{R}^3$ but rather a 
discrete set $\omega(mnp) = E(P(mnp))$ and at a given oscillation frequency $\omega(mnp)$, 
the Dirac field contributes an amount 

$$\psi_{mnp}(\omega, r) = \chi_1(mnp, r)\delta(\omega - \omega(mnp))a(mnp)$$

If we consider the corresponding negative frequency terms also (ie, radiation 
from both electrons and positrons), then the result is 

$$\psi_{mnp}(\omega, r) = \chi_1(mnp, r)\delta(\omega - \omega(mnp))a(mnp) + \chi_2(mnp, r)\delta(\omega + \omega(mnp))b(mnp)^*$$
The result of performing the above convolution is then 


J#(w,r) = x1(mnp,r) 
The periodogram is an estimator of the power spectrum: 
$$S_N(\omega) = N^{-1}\left|\sum_{n=0}^{N-1} X(n)\exp(-j\omega n)\right|^2$$

$X(n)$ is a stationary zero mean Gaussian process. Then, if $R(n)$ is absolutely 
summable, 

$$E[S_N(\omega)] \to S(\omega), \quad N \to \infty$$

$$E(S_N(\omega_1)S_N(\omega_2)) = N^{-2}\sum_{n,m,k,l=0}^{N-1} E(X(n)X(m)X(k)X(l))\exp(-j\omega_1(n-m) - j\omega_2(k-l))$$
Now, 

$$E(X(n)X(m)X(k)X(l)) = R(n-m)R(k-l) + R(n-k)R(m-l) + R(n-l)R(m-k)$$

The first term based on this decomposition converges to $S(\omega_1)S(\omega_2)$. The second 
term is 

$$N^{-2}\sum_{n,m,k,l=0}^{N-1} R(n-k)R(m-l)\exp(-j\omega_1(n-m) - j\omega_2(k-l))$$
$$= N^{-2}\sum R(a)R(b)\exp(-j\omega_1(k+a-l-b) - j\omega_2(k-l))$$
$$= N^{-2}\sum R(a)R(b)\exp(-j(\omega_1+\omega_2)(k-l)).\exp(-j\omega_1(a-b))$$

with the summation range of the indices being 

$$0 \le k, l, a+k, b+l \le N-1,$$

or equivalently, 
$$|a|, |b| \le N-1, \quad \max(0,-a) \le k \le \min(N-1, N-1-a),$$
$$\max(0,-b) \le l \le \min(N-1, N-1-b)$$
It is easy to see that this term can be expressed as 

$$\left[\sum_{|a| \le N-1} R(a)\exp(-j(\omega_1+\omega_2)(N-a)/2)\exp(-j\omega_1 a).\frac{\sin(\omega(N-|a|-1)/2)}{N\sin(\omega/2)}\right] \times [a \leftrightarrow b, \omega_1 \leftrightarrow \omega_2]$$
where $[a \leftrightarrow b, \omega_1 \leftrightarrow \omega_2]$ denotes the same as the previous 
but with the indicated interchanges. This term evidently converges to zero as 
$N \to \infty$ for positive $\omega_1, \omega_2$. The third and last term evaluates to 

$$N^{-2}\sum_{n,l=0}^{N-1} R(n-l)\exp(-j(\omega_1 n - \omega_2 l)).\sum_{m,k=0}^{N-1} R(m-k)\exp(j(\omega_1 m - \omega_2 k))$$
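
The two statements above, that the mean of the periodogram approaches the spectrum while its fluctuations do not die out, can be checked numerically in the simplest case. The sketch below assumes unit variance white Gaussian noise, for which $S(\omega) = 1$, and estimates the mean and variance of the periodogram at one interior frequency bin over many independent realisations; the function names and the choice of bin are assumptions of this example.

```python
import numpy as np

def periodogram(x):
    """S_N(w_k) = |sum_n x[n] exp(-j w_k n)|^2 / N at w_k = 2 pi k / N."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.fft(x)) ** 2 / len(x)

def white_noise_periodogram_stats(N=256, trials=2000, seed=0):
    """Sample mean and variance of S_N at one interior bin for white noise."""
    rng = np.random.default_rng(seed)
    k = N // 4                                # interior bin, away from 0 and pi
    vals = np.array([periodogram(rng.standard_normal(N))[k]
                     for _ in range(trials)])
    return vals.mean(), vals.var()
```

For this input both numbers should come out close to 1 whatever the value of `N`, consistent with the fourth moment calculation: the variance of the periodogram tends to $S(\omega)^2$ and not to zero, so averaging or smoothing is needed to obtain a consistent spectral estimate.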
[25] Further topics in statistical signal processing 
[1] Generalize the Levinson-Durbin algorithm for vector valued AR wide 
sense stationary processes defined by 

$$X(t) = -\sum_{k=1}^{p} A_p(k)X(t-k) + W(t)$$

where $X(t) \in \mathbb{R}^M$ and $A(1), \ldots, A(p) \in \mathbb{R}^{M \times M}$. Show that the optimal normal 
equations, assuming that $W(t)$ is iid $N(0, I)$, are given by 

$$R[k] + \sum_{m=1}^{p} A_p[m]R[k-m] = E[p]\delta[k], \quad k = 0, 1, \ldots, p$$

where 

$$R[k] = E(X(t)X(t-k)^T) \in \mathbb{R}^{M \times M}$$

Use the block Toeplitz and block centro-symmetry properties of the 
autocorrelation matrix 

$$((R[k-m]))_{1 \le k,m \le p} \in \mathbb{R}^{Mp \times Mp}$$
to obtain order recursive solutions for the matrix coefficients $A_p[k], k = 1, 2, \ldots, p$. 
Let 
B,|k] = Ap|[p +1—k],k = 1,2,...,p 


Then, let 
0 00. O In 
iS O 0 0.242 Gaye On. | eee 

tiv 0 0 0 0 

Let 
Sp = ((Rlm — k]))1<k,m<p] € RMP*P 
Then ; 
Ip Spily = Spvda = Line 
where - 
Sp = ((R[k — m]))i<km<p 

Then, let 


Ap = [Im, Ap[1], Ap[2], ..., Ap[p]] € R!MxM(pt+1) 


Then, the optimal normal equations can be expressed as 
ApSp+1 = [E[p], 0, ..-, 0] 


Then, 


Ap IJp41Ip41Sp+i psi = [Ep], 0, ..., 0] Jp4i 


or equivalently, 
By Sp+1 = (0, 0, ar) E(p)] 
where 
By = [Ap[p], Ap[p =), vy Ap[1], Ina] 
Thus, 
B,[1 : p|Sp + R[p: 1] =0 
Note that 


B,[1 : p] = Ap[1 : pl Jp 
On the other hand, we have that 
Ap+15p42 = [E[pt+ 1],0,...,0] 
or equivalently, defining 
Apsalt p+ 1] = [Apsall],-, Ap+al + Ul, 
we deduce that 


Ap+ill : p|Sp41 + Apsilp + 1]R[—p — 1: -1) = [E[p + 1],0,...,0] 


where 
AlapSle=l) = [al=p- 1) Raph Ad] 
Also, 
Ril: p+1)+ Apzill: p+ lSp41 = 0 
Note that 
Ap+1 = [T, Ap+i[]], .--, Aptilp + 1] =, Apsill : p+ 1] 
and hence 


Ap+i[l : p|Sp + Apsilp + 1]R[—p : -1] + R[1: p] =0 


On the other hand, for an arbitrary M x M matrix K,+1, we have from the 
above p” order equations, 


Ril: p| + Ap[1: p|Sp = 0 
We define B,[1 : p] to be the solution to 
R[-1: —p|Jp + B,[1 : p]Sp = 0, 


or equivalently, 7 
R[-p : -1] + B,[l : pS, = 0 


Likewise, we define A,[1 : p] to be the solution to 
R[-1:—p]+ A,[1: p]S, = 0 


Note that 


B,[1 pl = A,|1 : pip 
Then, 
(Ap[1 : p] — Kp41 Bp[1 : p])Sp = —R[L: p] + Kpyi R[-1: —p]Jp 
It follows by comparing the above equations that if we take K,+1 so that 
Kp4iR[-1: —p|Jp = —Apyilp + 1 R[-p : -]] 
then we are assured that 
Ap+i[L: p] = Ap[1: p] — Kp41 Bp[1 : p] 


However, noting that 


the condition reduces to simply 
Koy1 = —Apyilpt 
Further, we have defined

\[ \mathbf{B}_p = \mathbf{A}_p J_{p+1} = [A_p[p], \ldots, A_p[1], I_M] = [B_p[1:p], I_M] \]

so that
\[ \mathbf{B}_p \tilde S_{p+1} = [0, 0, \ldots, 0, E[p]] \]

and this gives us
\[ B_p[1:p]\, \tilde S_p + R[p:1] = 0 \]

and hence
\[ (B_p[1:p] - \tilde K_{p+1} \tilde A_p[1:p])\, \tilde S_p = -(R[p:1] - \tilde K_{p+1} R[-1:-p]) \]
On the other hand, we have that
\[ \mathbf{B}_{p+1} \tilde S_{p+2} = [0, \ldots, 0, E[p+1]] \]
which implies that
\[ B_{p+1}[1:p+1]\, \tilde S_{p+1} + R[p+1:1] = 0 \]
or equivalently,
\[ B_{p+1}[2:p+1]\, \tilde S_p + B_{p+1}[1]\, R[-1:-p] = -R[p:1] \]
Thus, if we take
\[ \tilde K_{p+1} = -B_{p+1}[1] = -A_{p+1}[p+1] = K_{p+1} \]
then we would get

\[ B_{p+1}[2:p+1] = B_p[1:p] - \tilde K_{p+1} \tilde A_p[1:p] \]
We also have

\[ R[-1:-p] + \tilde A_p[1:p]\, \tilde S_p = 0 \]
so that
\[ R[-1:-p-1] + \tilde A_{p+1}[1:p+1]\, \tilde S_{p+1} = 0 \]

which gives
\[ \tilde A_{p+1}[1:p]\, \tilde S_p + \tilde A_{p+1}[p+1]\, R[p:1] = -R[-1:-p] \]

and further,

\[ (\tilde A_p[1:p] - L_{p+1} B_p[1:p])\, \tilde S_p = -R[-1:-p] + L_{p+1} R[p:1] \]

so that if we define
\[ L_{p+1} = -\tilde A_{p+1}[p+1] \]
then we would get

\[ \tilde A_{p+1}[1:p] = \tilde A_p[1:p] - L_{p+1} B_p[1:p] \]
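
The order recursion above can be checked numerically. The following is a minimal sketch, assuming the convention $R[-k] = R[k]^T$ and using separate forward and backward reflection matrices (the standard multichannel form of the Levinson recursion, which may differ in bookkeeping from the notation above). The names `block_levinson` and `direct_solve`, and the moving-average test data, are illustrative choices and not part of the text.

```python
import numpy as np

def block_levinson(R, p):
    """Order-recursive solution of sum_m A[m] R[k-m] = E delta[k], k = 0..p.

    R is a list [R[0], ..., R[p]] of M x M arrays, with R[-k] = R[k].T.
    Returns (A, E) with A[0] = I and A[m] the order-p forward coefficients.
    """
    M = R[0].shape[0]
    r = lambda k: R[k] if k >= 0 else R[-k].T
    zero = np.zeros((M, M))
    A = [np.eye(M)]      # forward predictor coefficients
    At = [np.eye(M)]     # backward predictor coefficients (tilde quantities)
    E, Et = R[0].copy(), R[0].copy()
    for n in range(p):
        # correlation of the order-n residuals with the newly included sample
        D = sum(A[m] @ r(n + 1 - m) for m in range(n + 1))
        Dt = sum(At[m] @ r(m - n - 1) for m in range(n + 1))
        Kf = -D @ np.linalg.inv(Et)   # becomes A_{n+1}[n+1]
        Kb = -Dt @ np.linalg.inv(E)   # becomes tilde A_{n+1}[n+1]
        Ae, Ate = A + [zero], At + [zero]
        A_new = [Ae[m] + Kf @ Ate[n + 1 - m] for m in range(n + 2)]
        At_new = [Ate[m] + Kb @ Ae[n + 1 - m] for m in range(n + 2)]
        E, Et = E + Kf @ Dt, Et + Kb @ D
        A, At = A_new, At_new
    return A, E

def direct_solve(R, p):
    """Reference solution: solve the block Toeplitz system in one shot."""
    M = R[0].shape[0]
    r = lambda k: R[k] if k >= 0 else R[-k].T
    # block (row m, column k) equals R[k - m]
    S = np.block([[r(k - m) for k in range(1, p + 1)] for m in range(1, p + 1)])
    rhs = -np.hstack([r(k) for k in range(1, p + 1)])
    Arow = np.linalg.solve(S.T, rhs.T).T   # Arow @ S = rhs
    return [np.eye(M)] + [Arow[:, i * M:(i + 1) * M] for i in range(p)]

# Illustrative data: autocorrelations of a vector moving-average process
rng = np.random.default_rng(0)
M, p, q = 2, 3, 6
Hc = [np.eye(M)] + [0.3 * rng.standard_normal((M, M)) for _ in range(q - 1)]
R = [sum(Hc[j + k] @ Hc[j].T for j in range(q - k)) for k in range(p + 1)]

A_fast, E = block_levinson(R, p)
A_ref = direct_solve(R, p)
max_err = max(np.abs(a - b).max() for a, b in zip(A_fast, A_ref))
```

The recursion costs on the order of $p^2$ block operations, against $p^3$ for the direct solve, which is the point of exploiting the block Toeplitz structure.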


[26] Reduce Dirac’s relativistic wave equation in a radial potential to the
problem of solving two coupled ordinary second order linear differential
equations in a single radial variable r. Solve these by the power series method. 
Now add quantum noise in the form of superposition of the derivatives of the 
creation and annihilation processes for the vector potential and superposition of 
the creation and annihilation processes for the scalar potential (start with the 
noisy vector potential expressed in terms of creation and annihilation process 
derivatives and apply the Lorentz gauge condition to derive the expression for 
the noisy scalar potential). Then incorporate quantum Ito’s correction terms in 
the Hudson-Parthasarathy noisy Dirac equation and assuming the bath to be 
in a coherent state, obtain formulas up to $O(e^2)$ for the transition probability of
the bound Dirac electron subject to quantum noise from one stationary state to 
another. 


[27] Notion of group algebra for a finite group and a representation of the 
group algebra in terms of a representation of the finite group. 


[28] Write down the Lie algebra representations of the standard generators
of SO(3) acting on differentiable functions defined on $\mathbb{R}^3$ and show that these
generators are precisely the angular momentum operators in quantum mechanics,
i.e., the x, y, z components of the operator $-i\, r \times \nabla$.


[29] The standard generators of the Lie algebra $sl(2,\mathbb{C})$ are denoted by
X, Y, H. They satisfy the commutation relations

\[ [H, X] = 2X, \quad [H, Y] = -2Y, \quad [X, Y] = H \]

From these commutation relations, derive all the finite dimensional irreducible
representations of $sl(2,\mathbb{C})$ and hence of $SL(2,\mathbb{C})$ and $SU(2,\mathbb{C})$. Show that
$SL(2,\mathbb{C})$ is the complexification of $SU(2,\mathbb{C})$ and also of $SL(2,\mathbb{R})$. These are,
up to isomorphism, the only two non-conjugate real forms of $SL(2,\mathbb{C})$. The
former is a compact real form while the latter is a non-compact real form.
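
As a numerical companion to this problem, the sketch below builds the standard $(n+1)$-dimensional irreducible representation on a basis $v_k = Y^k v_0$ of a highest weight vector $v_0$ and checks the three commutation relations. The function name `sl2_irrep` and the choice of basis normalization are assumptions made for illustration.

```python
import numpy as np

def sl2_irrep(n):
    """Matrices of H, X, Y in the (n+1)-dimensional irreducible representation.

    Basis v_0, ..., v_n with H v_k = (n - 2k) v_k, Y v_k = v_{k+1},
    X v_k = k (n - k + 1) v_{k-1}.
    """
    d = n + 1
    H = np.diag([n - 2.0 * k for k in range(d)])
    X = np.zeros((d, d))
    Y = np.zeros((d, d))
    for k in range(1, d):
        X[k - 1, k] = k * (n - k + 1)
        Y[k, k - 1] = 1.0
    return H, X, Y

def comm(a, b):
    return a @ b - b @ a

relations_hold = all(
    np.allclose(comm(H, X), 2 * X)
    and np.allclose(comm(H, Y), -2 * Y)
    and np.allclose(comm(X, Y), H)
    for H, X, Y in (sl2_irrep(n) for n in range(6))
)
```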


[30] Some problems in group representation theory. 

[1] Define the Weyl group for a semisimple Lie algebra. First define and
prove the existence of a Cartan subalgebra, i.e., a maximal Abelian subalgebra, and
show that all its elements are semisimple in the adjoint representation. This
is true only for semisimple Lie algebras. The Weyl group is defined as the
normalizer of the Cartan subalgebra modulo its centralizer. Show that the Weyl group is generated
by the set of reflections corresponding to simple roots w.r.t. any positive system
of roots. Show that the Weyl group acts dually on the set of roots by permuting
them. Give examples of complex simple Lie algebras, their Cartan subalgebras
and the associated Weyl groups. Consider $sl(n,\mathbb{C})$, $so(n,\mathbb{C})$, $sp(n,\mathbb{C})$. Use a
convenient basis for these simple Lie algebras for constructing the root vectors
and the Cartan subalgebras. Show that every complex semisimple Lie algebra
has, up to conjugacy, just one Cartan subalgebra. Show by taking
the example of $sl(2,\mathbb{R})$ that real semisimple Lie algebras can have more than
one non-conjugate Cartan subalgebra.
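
For $sl(3,\mathbb{C})$ the statement that the simple reflections generate the Weyl group, and that the group permutes the roots, can be verified by brute force. This is a small sketch, assuming the usual realization of the roots as $e_i - e_j$ in $\mathbb{R}^3$; the helper names `reflection` and `generate_group` are illustrative.

```python
import numpy as np

def reflection(alpha):
    """Matrix of the reflection v -> v - 2 (v, alpha)/(alpha, alpha) alpha."""
    alpha = np.asarray(alpha, dtype=float)
    return np.eye(len(alpha)) - 2.0 * np.outer(alpha, alpha) / alpha.dot(alpha)

def generate_group(gens):
    """Close a finite set of matrices under multiplication (breadth-first)."""
    key = lambda g: tuple((np.round(g, 6) + 0.0).ravel())
    ident = np.eye(gens[0].shape[0])
    seen = {key(ident): ident}
    frontier = [ident]
    while frontier:
        nxt = []
        for g in frontier:
            for s in gens:
                h = s @ g
                if key(h) not in seen:
                    seen[key(h)] = h
                    nxt.append(h)
        frontier = nxt
    return list(seen.values())

# Simple roots of sl(3, C) and the full root system e_i - e_j, i != j
simple_roots = [(1, -1, 0), (0, 1, -1)]
e = np.eye(3)
roots = {tuple(e[i] - e[j]) for i in range(3) for j in range(3) if i != j}

W = generate_group([reflection(a) for a in simple_roots])
permutes_roots = all(
    {tuple((np.round(w @ np.array(r), 6) + 0.0)) for r in roots} == roots for w in W
)
```

Here the group has six elements, the symmetric group on three letters, as expected for $sl(3,\mathbb{C})$.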


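[2] Let $\pi$ be a representation of a semisimple Lie algebra $\mathfrak{g}$, and
let $v$ be a vector in the vector space $V$ in which the representation $\pi$ acts. Suppose
$v$ is a cyclic vector and that $\pi(H)v = \lambda(H)v \; \forall H \in \mathfrak{h}$, i.e., $v \in V_\lambda$ for some
$\lambda \in \mathfrak{h}^*$, and $\pi(X_i)v = 0, i = 1, 2, \ldots, l$. Show that if $\mathfrak{N}_-$ is the subalgebra of the
universal enveloping algebra $\mathfrak{G}$ of $\mathfrak{g}$ generated by the negative root vectors, then $V = \pi(\mathfrak{N}_-)v$. Hence deduce that $\pi$ is
a representation with weights, i.e., we can write

\[ V = \bigoplus_{\mu \in \mathfrak{h}^*} V_\mu \]

where
\[ V_\mu = \{ w \in V : \pi(H)w = \mu(H)w \; \forall H \in \mathfrak{h} \} \]
Show that
\[ \dim V_\lambda = 1 \]
i.e.,
\[ V_\lambda = \mathbb{C}.v \]

Show further that if $\pi$ is an irreducible representation, and if $w \in V$ is any
vector such that $\pi(X_k)w = 0, 1 \le k \le l$, then $w \in V_\lambda$.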


[3] Let $\pi$ be a representation of a Lie algebra $\mathfrak{g}$ in $V$ and let $v$ be a non-zero
vector in $V$. Let $\mathfrak{G}$ denote the universal enveloping algebra of $\mathfrak{g}$ and define

\[ \mathfrak{M} = \{ a \in \mathfrak{G} : \pi(a)v = 0 \} \]
Show that $\mathfrak{M}$ is an ideal in $\mathfrak{G}$. Define a map
\[ T : \mathfrak{G}/\mathfrak{M} \to V \]

by
\[ T(a + \mathfrak{M}) = \pi(a)v, \quad a \in \mathfrak{G} \]

Show that $T$ is a well defined linear map between vector spaces and that $T$
is one-one. Let $\mathfrak{M}_1$ be an ideal in $\mathfrak{G}$ containing $\mathfrak{M}$. Then, show that $W =
T(\mathfrak{M}_1/\mathfrak{M})$ consists of all elements $\pi(a)v, a \in \mathfrak{M}_1$, in $V$ and is therefore a $\pi$-invariant
subspace of $V$. Hence deduce that if $\pi$ is an irreducible representation,
then $T$ is a bijection and $\mathfrak{M}_1$ is either $\mathfrak{M}$ or $\mathfrak{G}$, i.e., $\mathfrak{M}$ is a
maximal ideal in $\mathfrak{G}$. Conversely, show that if $\pi$ is an arbitrary representation
and $\mathfrak{M}$ as defined above is a maximal ideal of $\mathfrak{G}$ with $v$ as a $\pi$-cyclic vector, then
$\pi$ is an irreducible representation. In fact, show that if $v$ is $\pi$-cyclic for any
representation $\pi$, then $T$ is a bijection and the mapping

\[ \mathfrak{M}_1/\mathfrak{M} \to T(\mathfrak{M}_1/\mathfrak{M}) \]

is a bijection between the set of all maximal ideals $\mathfrak{M}_1$ of $\mathfrak{G}$ containing $\mathfrak{M}$ and
the set of all $\pi$-invariant subspaces of $V$.


Give examples of infinite dimensional Lie algebras having finite dimensional
non-trivial representations. Show that if $G$ is a simply connected Lie group and
$H$ is a discrete normal subgroup, then the group $\pi_1(G/H)$ is isomorphic to
$H$. Hint: First show that if $\gamma : [0,1] \to G/H$ is any continuous curve and if
$p : G \to G/H$ is the canonical projection, then there exists a unique continuous
curve $\tilde\gamma : [0,1] \to G$ such that $p \circ \tilde\gamma = \gamma$. For this, you must make use of the
discreteness of $H$. Now let $\gamma : [0,1] \to G/H$ be a closed curve starting at $H$.
Then, define $M(\gamma) = \tilde\gamma(1)$. Prove that if $C$ is the set of equivalence classes of all closed
curves in $G/H$ starting at $H$, with composition defined in the usual way
that one defines in homotopy, i.e., $\gamma_1 \circ \gamma_2(t) = \gamma_1(2t), 0 \le t \le 1/2$ and $= \gamma_2(2t-1)$
for $1/2 \le t \le 1$, then $C$ becomes a group and $M$ is an isomorphism from
$C$ to $H$. To show that, you must make use of the simple connectedness of $G$.
Indeed, let $\gamma \in C$ and let $\tilde\gamma : [0,1] \to G$ be the uniquely defined continuous curve
as above. Then, if $\tilde\gamma(1) = e$, it follows from the simple connectedness of $G$ that
$\tilde\gamma$ is homotopic to the constant curve $\tilde\gamma_0(t) = e, 0 \le t \le 1$, from which it follows
that $\gamma = p \circ \tilde\gamma$ is homotopic to $\gamma_0 = p \circ \tilde\gamma_0$. This proves the injectivity of $M$, and
the surjectivity of $M$ is obvious. Note the way in which we make use of the
normality of $H$ in $G$: only because $H$ is normal does it follow that $G/H$ is a group,
owing to which, if we define the composition of two curves $\tilde\gamma_1$ and $\tilde\gamma_2$ in $G$, both
starting at $e$, by the rule $\tilde\gamma = \tilde\gamma_1 \circ \tilde\gamma_2$, where $\tilde\gamma(t) = \tilde\gamma_1(2t)$ for $0 \le t \le 1/2$ and
$\tilde\gamma(t) = \tilde\gamma_1(1) \circ \tilde\gamma_2(2t-1), 1/2 \le t \le 1$, then it follows that

\[ p(\tilde\gamma_1 \circ \tilde\gamma_2) = p(\tilde\gamma_1) \circ p(\tilde\gamma_2) \]
[31] New topics in plasmonic antennas

[1] Quantum Boltzmann equation, Version 1: We have a creation-annihilation
operator field in three-momentum space, $a(K), a(K)^*, K \in \mathbb{R}^3$. These satisfy the
canonical commutation relations (CCR):

\[ [a(K), a(K')^*] = \delta^3(K - K') \]


Let p denote the density matrix of the system (at time t = 0). Under the 
Heisenberg matrix mechanics, p remains a constant while a(K),a(K)* evolve 
with time to a(K,t),a(K,t)*. Let H denote the Hamiltonian of the system. 
Then, 

a(K,t) = exp(itH )a(K).exp(—itH) 


a(K,t)* = exp(itH)a(K)* .exp(—itH) 


Alternatively, if we adopt the Schrödinger wave mechanics picture, the operators
$a(K), a(K)^*$ remain constant with time while the density $\rho$ evolves with time
to

p(t) = exp(—itH)p.exp(itH) 
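
The equivalence of the two pictures, and the stationarity of a Gibbs state, can be illustrated on a finite dimensional toy model. The sketch below replaces the field operators by random Hermitian matrices, which is an assumption made only so that the traces can be computed; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

def random_hermitian(d):
    Z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return (Z + Z.conj().T) / 2

H = random_hermitian(n)   # toy Hamiltonian
A = random_hermitian(n)   # toy observable
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
rho = Z @ Z.conj().T
rho = rho / np.trace(rho).real   # a density matrix

w, V = np.linalg.eigh(H)

def U(t):
    """exp(-itH) computed from the eigendecomposition of H."""
    return (V * np.exp(-1j * t * w)) @ V.conj().T

t = 0.7
A_t = U(t).conj().T @ A @ U(t)        # Heisenberg: exp(itH) A exp(-itH)
rho_t = U(t) @ rho @ U(t).conj().T    # Schrodinger: exp(-itH) rho exp(itH)
heisenberg_mean = np.trace(rho @ A_t)
schrodinger_mean = np.trace(rho_t @ A)

beta = 0.5
gibbs = (V * np.exp(-beta * w)) @ V.conj().T
gibbs = gibbs / np.trace(gibbs).real
gibbs_t = U(t) @ gibbs @ U(t).conj().T
```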


We assume that $\rho = \exp(-\beta H)/Z(\beta)$, $Z(\beta) = Tr(\exp(-\beta H))$, i.e., $\rho$ is the
canonical Gibbs density. Then, $\rho$ remains constant with time under the Schrödinger
picture. Define the Wigner-distribution function

\[ f(r, K, t) = Tr\big(\rho(t)\, a(K + r/\tau)\, a(K - r/\tau)^*\big) \]
We assume that
\[ H = \int g(K)\, a(K)^* a(K)\, d^3K + \int h(K_1, K_2)\, a(K_1)^* a(K_1) a(K_2)^* a(K_2)\, d^3K_1 d^3K_2 \]

and we take $\rho(0)$ as the Gaussian state

\[ \rho(0) = \exp\Big(-\beta \int g(K)\, a(K)^* a(K)\, d^3K\Big) / Z(\beta) \]
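We calculate
\[ \partial f(r, K, t)/\partial t = Tr\big(-i[H, \rho(t)]\, a(K + r/\tau)\, a(K - r/\tau)^*\big) \]
\[ = i\, Tr\big(\rho(t)\, [H, a(K + r/\tau)\, a(K - r/\tau)^*]\big) \]
Now,
\[ [H, a(K + r/\tau) a(K - r/\tau)^*] = [H, a(K + r/\tau)]\, a(K - r/\tau)^* + a(K + r/\tau)\, [H, a(K - r/\tau)^*] \]

Now

\[ [H, a(K + r/\tau)] = \int g(K')\, [a(K')^* a(K'), a(K + r/\tau)]\, d^3K' \]

\[ + \int h(K_1, K_2)\, [a(K_1)^* a(K_1) a(K_2)^* a(K_2), a(K + r/\tau)]\, d^3K_1 d^3K_2 \]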
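Now,
\[ [a(K')^* a(K'), a(K + r/\tau)] = [a(K')^*, a(K + r/\tau)]\, a(K') = -\delta^3(K - K' + r/\tau)\, a(K'), \]
\[ [a(K_1)^* a(K_1) a(K_2)^* a(K_2), a(K + r/\tau)] = \]
\[ [a(K_1)^*, a(K + r/\tau)]\, a(K_1) a(K_2)^* a(K_2) + a(K_1)^* a(K_1)\, [a(K_2)^*, a(K + r/\tau)]\, a(K_2) \]
\[ = -\delta^3(K - K_1 + r/\tau)\, a(K_1) a(K_2)^* a(K_2) - \delta^3(K - K_2 + r/\tau)\, a(K_1)^* a(K_1) a(K_2) \]
Thus,
\[ [H, a(K + r/\tau)] = -\int g(K')\, \delta^3(K - K' + r/\tau)\, a(K')\, d^3K' \]
\[ - \int h(K_1, K_2) \big( \delta^3(K - K_1 + r/\tau)\, a(K_1) a(K_2)^* a(K_2) + \delta^3(K - K_2 + r/\tau)\, a(K_1)^* a(K_1) a(K_2) \big)\, d^3K_1 d^3K_2 \]

\[ = -g(K + r/\tau)\, a(K + r/\tau) - a(K + r/\tau) \int h(K + r/\tau, K_2)\, a(K_2)^* a(K_2)\, d^3K_2 \]
\[ - \Big( \int h(K_1, K + r/\tau)\, a(K_1)^* a(K_1)\, d^3K_1 \Big)\, a(K + r/\tau) \]
\[ = -g(K + r/\tau)\, a(K + r/\tau) - \Big\{ a(K + r/\tau), \int h(K + r/\tau, K')\, a(K')^* a(K')\, d^3K' \Big\} \]
Note that without any loss of generality, we are assuming that
\[ h(K_1, K_2) = h(K_2, K_1) \]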


[32] Hartree-Fock equations for an N-electron atom 
The Hamiltonian is given by 


\[ H = \sum_{k=1}^{N} H_{0k} + \sum_{1 \le k < j \le N} V_{kj} \]
where
\[ H_{0k} = P_k^2/2m + V_{0k} \]
with
\[ V_{0k} = V_0(r_k) + (ge/2m)(\sigma_k, B_k) \]
where

\[ V_0(r_k) = -Ze^2/4\pi\epsilon r_k, \]

and $B_k$ is the magnetic field produced by the nucleus at the site of the $k^{th}$
electron owing to its relative motion. It is given by

\[ B_k = -\mu e (v_k \times r_k)/4\pi r_k^3 = -\mu e (P_k \times r_k)/4\pi m r_k^3 = \mu e L_k/4\pi m r_k^3 \]
where
\[ L_k = r_k \times P_k = -i\, r_k \times \nabla_k \]

is the angular momentum of the $k^{th}$ electron. Thus,
\[ (ge/2m)(\sigma_k, B_k) = \mu g e^2 (\sigma_k, L_k)/8\pi m^2 r_k^3 \]

This is also called the spin-orbit interaction. Further, the interaction between
the $k^{th}$ and the $j^{th}$ electron for $k \ne j$ is given by

\[ V_{kj} = e^2/4\pi\epsilon r_{kj} + ge(\sigma_k, B_{kj})/2m \]


where $B_{kj}$ is the magnetic field at the site of the $k^{th}$ electron produced by the
$j^{th}$ electron. This magnetic field has two components: one, the magnetic field
produced by the spin of the $j^{th}$ electron, and two, the magnetic field produced
by the relative motion between the two electrons. The first component is given
by

\[ B^{(1)}_{kj} = curl(\mu\, m_j \times r_{kj}/4\pi r_{kj}^3), \quad m_j = ge\sigma_j/2m \]

and the second component is given by
\[ B^{(2)}_{kj} = -\mu e (P_k - P_j) \times r_{kj}/4\pi m r_{kj}^3 \]

An alternate way to describe this interaction is to consider the spin-orbital
magnetic moment of the $j^{th}$ electron:

\[ M_j = e(L_j + g\sigma_j)/2m \]
and to consider the magnetic field
\[ curl\big((\mu/4\pi)\, M_j \times r_{kj}/r_{kj}^3\big) \]


produced by this total magnetic moment at the site of the $k^{th}$ electron. Yet
another model for this interaction is to consider the total electromagnetic field
produced by the $j^{th}$ electron:

\[ E_j(r_k) = -e\, r_{kj}/4\pi\epsilon r_{kj}^3, \]

\[ B_j(r_k) = curl\big((\mu/4\pi)\, M_j \times r_{kj}/r_{kj}^3\big) \]

at the site of the $k^{th}$ electron and then to determine this electromagnetic field
in the rest frame of the $k^{th}$ electron:

\[ E_j'(r_k) = E_j(r_k) + (P_k/m) \times B_j(r_k), \]

\[ B_j'(r_k) = B_j(r_k) \]
provided that we neglect relativistic effects. Taking all these effects into account,
we can express the total Hamiltonian of the system of N electrons in the form

\[ H = \sum_k (P_k^2/2m - Ze^2/4\pi\epsilon r_k) + \sum_k f_1(r_k)(\sigma_k, L_k) \]
\[ + \sum_{j<k} f_2(r_{kj})(\sigma_k, \sigma_j) + \sum_{j<k} f_3(r_{kj})(L_k, L_j) + \sum_{j<k} f_4(r_{kj})(L_k, \sigma_j) \]
To simplify our calculations, we shall specialize these results to a two electron
atom, as for example the Helium atom. The total Hamiltonian of this system
of two electrons is given by

\[ H = H_{01} + H_{02} + V_{12} \]
where
\[ H_{01} = P_1^2/2m - Ze^2/r_1 + f_1(r_1)(\sigma_1, L_1), \quad H_{02} = P_2^2/2m - Ze^2/r_2 + f_1(r_2)(\sigma_2, L_2) \]

\[ V_{12} = e^2/4\pi\epsilon r_{12} + (e/2m)(B_1(r_2), L_2 + g\sigma_2) \]
where
\[ B_1(r_2) = (\mu e/4\pi m r_{21}^3)(P_2 - P_1) \times r_{21} + curl_2\big((\mu/4\pi r_{21}^3)(ge\sigma_1/2m) \times r_{21}\big) \]

Another way to calculate the interaction between the spin and orbital magnetic
moments of the two electrons is to calculate the total magnetic field produced
by both the electrons and to integrate its square over the whole of space. The
resulting energy is then equal to $(1/2\mu) \int |B_1(r) + B_2(r)|^2 d^3r$ and the interaction
component of this energy equals $(1/\mu) \int (B_1(r), B_2(r)) d^3r$, which is proportional
to a term of the form $f(r_{12})(L_1 + g\sigma_1, L_2 + g\sigma_2)$ because the magnetic field
produced by the first electron is

\[ B_1(r) = curl\big((\mu/4\pi|r - r_1|^3)\, M_1 \times (r - r_1)\big), \quad M_1 = e(L_1 + g\sigma_1)/2m \]
and likewise for $B_2(r)$. Thus,
\[ V_{12} = e^2/4\pi\epsilon r_{12} + f_2(|r_{12}|)(L_1 + g\sigma_1, L_2 + g\sigma_2) \]
which is of the above form. Thus, we write
\[ H = (P_1^2/2m - Ze^2/r_1 + f_1(r_1)(\sigma_1, L_1)) + (P_2^2/2m - Ze^2/r_2 + f_1(r_2)(\sigma_2, L_2)) \]

\[ + e^2/4\pi\epsilon r_{12} + f_2(|r_{12}|)(L_1 + g\sigma_1, L_2 + g\sigma_2) \]


In order to apply the variational principle for computing the approximate wave 
functions of the two electrons, we first note that the test wave function must 
be antisymmetric with respect to interchange of the position-momentum-spin 
indices in accordance with the Pauli exclusion principle. Thus we may try wave 
functions of the form 


\[ \psi_1(1,2) = C_1 (\phi_1(r_1)\phi_2(r_2) - \phi_1(r_2)\phi_2(r_1))\, |++\rangle \]
\[ \psi_2(1,2) = C_2 (\phi_1(r_1)\phi_2(r_2) - \phi_1(r_2)\phi_2(r_1))\, |--\rangle \]
\[ \psi_3(1,2) = C_3 (\phi_1(r_1)\phi_2(r_2) + \phi_1(r_2)\phi_2(r_1)) (|+-\rangle - |-+\rangle) \]

where $\phi_1$ and $\phi_2$ are normalized position space wave functions and $C_1, C_2, C_3$
are normalization constants given by

\[ 1/C_1^2 = 2(1 - \langle \phi_1, \phi_2 \rangle^2) = 1/C_2^2, \]
\[ 1/C_3^2 = 4(1 + \langle \phi_1, \phi_2 \rangle^2) \]
assuming the wave functions to be real.
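
The normalization constants can be checked on a grid. In this sketch the orbitals are replaced by random real unit vectors, an assumption made purely for the numerical check: the antisymmetric spatial part should have squared norm $2(1 - \langle\phi_1,\phi_2\rangle^2)$, and the symmetric spatial part combined with the unnormalized singlet spin factor (squared norm 2) should have squared norm $4(1 + \langle\phi_1,\phi_2\rangle^2)$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Discretized real orbitals, normalized so that sum(phi**2) = 1
phi1 = rng.standard_normal(50)
phi1 /= np.linalg.norm(phi1)
phi2 = rng.standard_normal(50)
phi2 /= np.linalg.norm(phi2)
overlap = phi1 @ phi2

# Entry [i, j] is the two-particle amplitude at (r_1 = i, r_2 = j)
anti = np.outer(phi1, phi2) - np.outer(phi2, phi1)
sym = np.outer(phi1, phi2) + np.outer(phi2, phi1)

norm_sq_triplet = np.sum(anti ** 2)         # spin factor |++> has norm 1
norm_sq_singlet = 2.0 * np.sum(sym ** 2)    # spin factor |+-> - |-+> has squared norm 2
```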

[33] Quantum Boltzmann equation for indistinguishable particles 
in a system in the presence of an external quantum electromagnetic 
field 

$\rho(t)$ is the state of the system and bath. The total Hamiltonian of the system
and bath has the form

\[ H = H_s + H_B + V_{sB} \]

where $H_s$ is the system Hamiltonian, $H_B$ the bath field Hamiltonian and $V_{sB}$ is
the interaction Hamiltonian between the system and bath. The system Hilbert
space is

\[ \mathcal{H}_s = \bigotimes_{n=1}^{N} \mathcal{H}_n \]

where $\mathcal{H}_n$ is the Hilbert space of the $n^{th}$ particle in the system and these Hilbert
spaces are identical copies. The bath Hilbert space is $\mathcal{H}_B$. $H_s$ acts in the system
Hilbert space and is therefore to be identified with $H_s \otimes I_B$. $H_B$ acts in the bath
Hilbert space and is therefore to be identified with $I_s \otimes H_B$, while $V_{sB}$ acts in
$\mathcal{H}_s \otimes \mathcal{H}_B$ and can therefore be expressed as

\[ V_{sB} = \sum_k V_{sk} \otimes V_{Bk} \]


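where $V_{sk}$ acts in $\mathcal{H}_s$ while $V_{Bk}$ acts in $\mathcal{H}_B$. The system Hamiltonian has the
form
\[ H_s = \sum_{n=1}^{N} H_{sn} + \sum_{1 \le n < k \le N} V_{nk} \]

where $H_{sn}$ acts in $\mathcal{H}_n$, the $n^{th}$ particle Hilbert space, and $V_{nk}$ acts in $\mathcal{H}_n \otimes \mathcal{H}_k$.
We assume that $H_{sn}, n = 1, 2, \ldots, N$ are identical copies and so are $V_{nk}, 1 \le n <
k \le N$. The system state at time $t$ is given by

\[ \rho_s(t) = Tr_B(\rho(t)) \]

while the bath state at time $t$ is given by

\[ \rho_B(t) = Tr_s(\rho(t)) \]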


The Schrödinger-Liouville-von Neumann equation for the density operator is

\[ i\rho'(t) = [H, \rho(t)] = [H_s, \rho(t)] + [H_B, \rho(t)] + [V_{sB}, \rho(t)] \]
and taking the partial trace, we get
\[ i\rho_s'(t) = [H_s, \rho_s(t)] + Tr_B[V_{sB}, \rho(t)], \]

\[ i\rho_B'(t) = [H_B, \rho_B(t)] + Tr_s[V_{sB}, \rho(t)] \]


In order to isolate the system dynamics from the bath dynamics we make the
approximation

\[ \rho(t) \approx \rho_s(t) \otimes \rho_B(t) \]

while calculating the term $Tr_B[V_{sB}, \rho(t)]$ (since $V_{sB}$ is already assumed to be
of the first order of smallness, this assumption amounts to neglecting second
order of smallness terms). Thus, we get for the system state dynamics the
approximate equation

\[ i\rho_s'(t) = [H_s, \rho_s(t)] + \sum_k Tr(\rho_B(t) V_{Bk})\, [V_{sk}, \rho_s(t)] \]
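
The step from the partial trace to the mean-field commutator rests on the identity $Tr_B[V_s \otimes V_B, \rho_s \otimes \rho_B] = Tr(\rho_B V_B)\,[V_s, \rho_s]$. A small numerical sketch of this identity follows, with random finite dimensional matrices standing in for the system and bath operators (an assumption made for illustration; the helper names are not from the text).

```python
import numpy as np

rng = np.random.default_rng(3)

def random_hermitian(d):
    Z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    return (Z + Z.conj().T) / 2

def random_density(d):
    Z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    rho = Z @ Z.conj().T
    return rho / np.trace(rho).real

def trace_out_bath(X, dS, dB):
    """Partial trace over the second (bath) factor of a (dS*dB) x (dS*dB) matrix."""
    return np.einsum('ibjb->ij', X.reshape(dS, dB, dS, dB))

dS, dB = 3, 4
rho_s, rho_b = random_density(dS), random_density(dB)
V_s, V_b = random_hermitian(dS), random_hermitian(dB)

rho = np.kron(rho_s, rho_b)   # product state of system and bath
V = np.kron(V_s, V_b)         # one term of the interaction

lhs = trace_out_bath(V @ rho - rho @ V, dS, dB)
a = np.trace(rho_b @ V_b)     # bath average multiplying the system commutator
rhs = a * (V_s @ rho_s - rho_s @ V_s)
```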


Writing
\[ Tr(\rho_B(t) V_{Bk}) = a_k(t), \quad V_s(t) = \sum_k a_k(t) V_{sk}, \]

we can write
\[ i\rho_s'(t) = [H_s, \rho_s(t)] + [V_s(t), \rho_s(t)] \]


where all operators here are now system space operators. We then assume that
the particles within the system are all indistinguishable and hence the partial
traces of $\rho_s(t)$ over all the $\mathcal{H}_k$'s except $\mathcal{H}_n$ are identical copies for the different
$n$'s. We then get on taking partial traces:

\[ i\rho_{s1}'(t) = [H_{s1}, \rho_{s1}(t)] + (N-1)\, Tr_2[V_{12}, \rho_{s12}(t)] + Tr_{23 \ldots N}[V_s(t), \rho_s(t)] \]
\[ i\rho_{s12}'(t) = [H_{s1} + H_{s2} + V_{12}, \rho_{s12}(t)] + (N-2)\, Tr_3[V_{13} + V_{23}, \rho_{s123}(t)] + Tr_{34 \ldots N}[V_s(t), \rho_s(t)] \]

These are the basic equations from which the quantum Boltzmann equation is
derived. We make further approximations:

\[ Tr_{23 \ldots N}[V_s(t), \rho_s(t)] \approx Tr_{23 \ldots N}[V_s(t), \rho_{s1}(t) \otimes \cdots \otimes \rho_{sN}(t)] \]
and
\[ V_s(t) \approx \sum_{k=1}^{N} V_{sk}(t) \]
where $V_{sk}(t)$ acts in $\mathcal{H}_k$ and for different $k$'s, these are identical copies. Then,
\[ Tr_{23 \ldots N}[V_s(t), \rho_{s1}(t) \otimes \cdots \otimes \rho_{sN}(t)] \approx [V_{s1}(t), \rho_{s1}(t)] \]


The logic in this latter approximation is that the bath field acts in an additive
way separately on each of the system particles and hence can be regarded as an
external time varying potential in which each of the system particles moves.
Thus, we may write
\[ H_1(t) = H_{s1} + V_{s1}(t) \]

and this Hamiltonian acts in $\mathcal{H}_1$, the Hilbert space of the first particle in the
system. Further, we approximate $Tr_3[V_{13} + V_{23}, \rho_{s123}(t)]$ by

\[ Tr_3[V_{13} + V_{23}, \rho_{12}(t) \otimes \rho_3(t)] \]

which is obtained on writing

\[ \rho_{s123}(t) \approx \rho_{s12}(t) \otimes \rho_3(t) + g_{123}(t) \]

where $g_{123}(t)$ is of the first order of smallness, and neglecting second order of
smallness terms. Further we write,

\[ \rho_{12}(t) = \rho_1(t) \otimes \rho_2(t) + g_{12}(t) \]

where $g_{12}(t)$ is of the first order of smallness. Then neglecting second order of
smallness terms, we have

\[ Tr_3[V_{13} + V_{23}, \rho_{123}(t)] = Tr_3[V_{13} + V_{23}, \rho_1(t) \otimes \rho_2(t) \otimes \rho_3(t)] \]
\[ = [W_1(t) + W_2(t), \rho_1(t) \otimes \rho_2(t)] \]


where
\[ W_1(t) = Tr_3(V_{13}\, \rho_3(t)), \quad W_2(t) = Tr_3(V_{23}\, \rho_3(t)) \]

Note that $W_1(t)$ and $W_2(t)$ are identical copies of each other acting in the
Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ respectively. Since we are neglecting second order of
smallness terms, we can substitute for $\rho_3(t)$ the expression

\[ \rho_3(t) = \exp(-it\, ad(H_{s3}))(\rho_3(0)) \]

in the above expressions for $W_1(t), W_2(t)$. If we are interested in calculating
$\rho_{s1}(t)$ up to the first order of smallness terms, then we would use the simplified
quantum Boltzmann equation

\[ i\rho_{s1}'(t) = [H_1(t), \rho_{s1}(t)] + (N-1)\, Tr_2[V_{12}, \rho_{s1}(t) \otimes \rho_{s2}(t)] \]

If however, we are interested in calculating $\rho_{s1}(t)$ up to the second order of
smallness terms, then we would use

\[ i\rho_{s1}'(t) = [H_1(t), \rho_{s1}(t)] + (N-1)\, Tr_2[V_{12}, \rho_{s1}(t) \otimes \rho_{s1}(t) + g_{12}(t)] \]

where by first order perturbation theory applied to the above equation satisfied
by $\rho_{s12}(t)$, we have

\[ i g_{12}'(t) = [H_{s1} + H_{s2}, g_{12}(t)] + [V_{12}, \rho_{s1}(t) \otimes \rho_{s1}(t)] \]
\[ + (N-2)\, [W_1(t) + W_2(t), \rho_{s1}(t) \otimes \rho_{s1}(t)] \]


[34] Some problems in fluid dynamics related to large deviation 
theory 

[1] Consider an incompressible fluid described by the following equations for 
the velocity field: 


\[ v_x v_{x,x} + v_y v_{x,y} + v_{x,t} = -p_{,x} + \eta (v_{x,xx} + v_{x,yy}) + \sqrt{\epsilon}\, f_x, \]
\[ v_x v_{y,x} + v_y v_{y,y} + v_{y,t} = -p_{,y} + \eta (v_{y,xx} + v_{y,yy}) + \sqrt{\epsilon}\, f_y, \]
\[ v_{x,x} + v_{y,y} = 0 \]
with all fields evaluated at $(t, x, y)$,


where $(f_x(t, x, y), f_y(t, x, y))$ is a random forcing field. Calculate the rate function
for the velocity and pressure fields over a given region of space-time assuming
that the forcing field is a Gaussian field with specified mean and covariance.


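[2] Let $X_1, X_2, \ldots$ be iid random variables and define

\[ S_n = X_1 + X_2 + \cdots + X_n \]

Calculate the limit of the distribution
\[ Pr(X_1 \in dx \,|\, S_n/n = m) \]

as $n \to \infty$. Make use of Cramér’s theorem on large deviations.
Hint: Let $I(x)$ denote the rate function of $\{S_n/n : n \ge 1\}$. Then, we have
approximately for large $n$,

\[ Pr(X_1 \in dx \,|\, S_n/n = m) = P(X_1 \in dx, X_2 + \cdots + X_n = nm - x)/P(S_n/n = m) \]

\[ = P(X_1 \in dx)\, P((X_2 + \cdots + X_n)/(n-1) = (nm - x)/(n-1))/P(S_n/n = m) \]
\[ \approx P(X_1 \in dx)\, \exp(-(n-1) I((nm - x)/(n-1)))/\exp(-n I(m)) \]
\[ \approx P(X_1 \in dx)\, \exp(-(n-1) I(m) - I'(m)(m - x))/\exp(-n I(m)) \]
\[ = P(X_1 \in dx)\, \exp(I'(m)(x - m) + I(m)) \]
where the fourth line uses $(nm - x)/(n-1) = m + (m - x)/(n-1)$ and a first order Taylor expansion of $I$ about $m$.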
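
For Bernoulli variables the limiting conditional law can be compared with the exact answer. The sketch below assumes $X_i$ are Bernoulli with parameter $p$, for which $I(m) = m\log(m/p) + (1-m)\log((1-m)/(1-p))$; it evaluates the tilted law $P(X_1 = x)\exp(I'(m)(x - m) + I(m))$ and the exact conditional probability $P(X_1 = 1 \,|\, S_n = nm)$. The function names are illustrative.

```python
import math

def rate(m, p):
    """Cramer rate function of Bernoulli(p) sample means."""
    return m * math.log(m / p) + (1 - m) * math.log((1 - m) / (1 - p))

def rate_prime(m, p):
    """Derivative I'(m) of the rate function."""
    return math.log(m * (1 - p) / ((1 - m) * p))

def tilted(x, m, p):
    """P(X_1 = x) exp(I'(m)(x - m) + I(m)) for x in {0, 1}."""
    prior = p if x == 1 else 1 - p
    return prior * math.exp(rate_prime(m, p) * (x - m) + rate(m, p))

def exact_conditional(n, k, p):
    """P(X_1 = 1 | S_n = k), computed from binomial probabilities."""
    joint = p * math.comb(n - 1, k - 1) * p ** (k - 1) * (1 - p) ** (n - k)
    marginal = math.comb(n, k) * p ** k * (1 - p) ** (n - k)
    return joint / marginal

p, n, k = 0.3, 200, 100
m = k / n
limit_law = (tilted(0, m, p), tilted(1, m, p))
exact = exact_conditional(n, k, p)
```

In this example the tilted law is a probability distribution with mean $m$, and it agrees with the exact conditional probability $k/n$.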


Historical remark: Guo, Papanicolaou, Kipnis, Olla and Varadhan introduced
an averaging method which enables one to obtain hydrodynamic scaling limits
for the interacting diffusion and simple exclusion models when the number of
particles tends to infinity and the lattice on which the particles are present
converges to the continuum.




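Let $\eta_t(x)$ be a simple exclusion process on $\mathbb{Z}$. It satisfies the stochastic
differential equations

\[ d\eta_t(x) = \sum_{y \ne x} [\eta_t(y)(1 - \eta_t(x))\, dN_t(y, x) - \eta_t(x)(1 - \eta_t(y))\, dN_t(x, y)] \]

where $\{N_t(x, y) : t \ge 0\}_{x \ne y}$ are independent Poisson processes with means
\[ E(N_t(x, y)) = p(x, y)t \]
Thus, the generator $L$ of the $\eta_t$ process is given by

\[ E[\phi(\eta_{t+dt}) | \eta_t] = \phi(\eta_t) + dt\, L\phi(\eta_t) \]

where for $\eta : \mathbb{Z} \to \{0, 1\}$,

\[ L\phi(\eta) = \sum_{x \ne y} p(x, y)\, \eta(x)(1 - \eta(y))(\phi(\eta^{(x,y)}) - \phi(\eta)) \]

and $\eta^{(x,y)}$ denotes the configuration obtained from $\eta$ by exchanging the occupation numbers at $x$ and $y$.

Let us assume that
\[ p(x, y) = p(y - x) \]
Then, we get

\[ L\phi(\eta) = \sum_{x, z} p(z)\, \eta(x)(1 - \eta(x + z))(\phi(\eta^{(x,x+z)}) - \phi(\eta)) \]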
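
The exclusion rule encoded in the generator is easy to simulate. The following is a minimal sketch, assuming a ring of $N$ sites, symmetric nearest neighbour jumps, and a random sequential update in place of the continuous time Poisson clocks; these simplifications, and the name `simulate_sep`, are illustrative assumptions.

```python
import numpy as np

def simulate_sep(eta0, steps, rng):
    """Random sequential update of a symmetric nearest neighbour exclusion process.

    At each step a site x and a direction z in {-1, +1} are chosen uniformly;
    a particle at x jumps to x + z (mod N) only if that site is empty.
    """
    eta = eta0.copy()
    N = len(eta)
    for _ in range(steps):
        x = rng.integers(N)
        z = rng.choice((-1, 1))
        y = (x + z) % N
        if eta[x] == 1 and eta[y] == 0:
            eta[x], eta[y] = 0, 1
    return eta

rng = np.random.default_rng(4)
N = 200
eta0 = (rng.random(N) < 0.5).astype(int)
eta_T = simulate_sep(eta0, 20000, rng)
```

The factor $\eta(x)(1 - \eta(x+z))$ in the generator corresponds to the `if` test: a jump is attempted only from an occupied site and suppressed if the target is occupied, so the particle number is conserved.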


Define the empirical distribution of the particles at time $t$ by
\[ \mu_{N,t} = N^{-1} \sum_x \eta_t(x)\, \delta_{x/N} \]

where the sum is over $x \in \mathbb{Z}_N = \{0, 1, \ldots, N-1\}$. Note that $\mu_{N,t}$ is a random
measure on $\mathbb{Z}_N$. By identifying $x \in \mathbb{Z}_N$ with $x/N = \theta \in [0, 1]$, we can regard
$\mu_{N,t}$ as a random measure on $[0, 1]$. Then, for any function $J : [0, 1] \to \mathbb{R}$, we
have that

\[ \int J(\theta)\, d\mu_{N,t}(\theta) = (1/N) \sum_x J(x/N)\, \eta_t(x) = \xi_{J,N}(t) \]


say. Then,

\[ d\xi_{J,N}(t) = (1/N) \sum_x J(x/N)\, d\eta_t(x) \]

\[ = (1/N) \sum_{x \ne y} J(x/N) \big( \eta_t(y)(1 - \eta_t(x))\, dN_t(y, x) - \eta_t(x)(1 - \eta_t(y))\, dN_t(x, y) \big) \]

\[ = (1/N) \sum_{x \ne y} J(x/N) \big( \eta_t(y)(1 - \eta_t(x))\, p(x - y) - \eta_t(x)(1 - \eta_t(y))\, p(y - x) \big)\, dt + dM_{J,N}(t) \]
\[ = dt\, (1/N) \sum_{x \ne y} (J(x/N) - J(y/N))\, p(x - y)\, \eta_t(y)(1 - \eta_t(x)) + dM_{J,N}(t) \]

where $M_{J,N}(t)$ is a martingale defined by

\[ dM_{J,N}(t) = (1/N) \sum_{x \ne y} J(x/N) \big( \eta_t(y)(1 - \eta_t(x))\, dM_t(y, x) - \eta_t(x)(1 - \eta_t(y))\, dM_t(x, y) \big) \]

\[ = (1/N) \sum_{x \ne y} (J(x/N) - J(y/N))\, \eta_t(y)(1 - \eta_t(x))\, dM_t(y, x) \]
where
\[ M_t(y, x) = N_t(y, x) - p(x - y)t, \quad x \ne y \]


are independent martingales. We note that

\[ E[(dM_{J,N}(t))^2] = (dt/N^2) \sum_{x \ne y} (J(x/N) - J(y/N))^2\, E[(\eta_t(y)(1 - \eta_t(x)))^2]\, p(x - y) \]

which is easily seen to tend to zero as $N \to \infty$. In fact, it is bounded above by

\[ (dt/N^2) \big( \sup_{\theta \in [0,1]} J'(\theta)^2 \big) \sum_{x, y \in \mathbb{Z}_N} (x - y)^2 p(x - y)/N^2 \]

\[ = (dt/N^3) \big( \sup_{\theta \in [0,1]} J'(\theta)^2 \big) \sum_z z^2 p(z) \]
which will converge to zero in the special case when $p(z)$ has finite range or
more generally even when $\sum_z p(z)$ is finite, which is then always the case since
$p(z)$ equals $\lambda$ times the probability that a particle will jump from $x$ to $x + z$,
where $\lambda$ is the rate of the Poisson clock. Note that

\[ \sum_{z \in \mathbb{Z}_N} z^2 p(z) \le N^2 \sum_z p(z) = \lambda N^2 \]


Thus, with neglect of the martingale terms, we find that

\[ d\xi_{J,N}(t) = dt\, (1/N) \sum_{x \ne y} (J(x/N) - J(y/N))\, p(x - y)\, \eta_t(y)(1 - \eta_t(x)) \]

\[ = (dt/N) \sum_{y, z} (J'(y/N) z/N + J''(y/N) z^2/2N^2)\, p(z)\, \eta_t(y)(1 - \eta_t(y + z)) \]
with neglect of terms which converge to zero as $N \to \infty$. For symmetric dynamics
($p(z) = p(-z)$), $\sum_z z p(z) = 0$ and if we assume that $\sum_z z^2 p(z) = DN^2$,
then we can prove by the method of averaging due to Guo, Papanicolaou and
Varadhan that

\[ \sum_y F(y)\, \eta(y)(1 - \eta(y + z)) \]
can be replaced by
\[ \sum_y F(y)\, (2N\epsilon + 1)^{-1} \sum_{x : |x - y| \le N\epsilon} \eta(x)(1 - \eta(x + z)) \]

in the limit when $N \to \infty$ and then $\epsilon \to 0$. This is called the principle of local
averaging. From this fact, one is able to prove that the limiting probability
distribution of the particles is concentrated on a set where the local density is
non-random and satisfies Burgers’ partial differential equation.


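Remarks: Let

\[ \bar\eta^k_x = (2k + 1)^{-1} \sum_{y : |y - x| \le k} \eta(y) \]

Then, we have that if $g(\theta)$ is a smooth function on $[0, 1]$,

\[ N^{-1} \sum_{x \in \mathbb{Z}_N} g(x/N)\, \eta(x) \]
can be replaced by
\[ N^{-1} \sum_x g(x/N)\, \bar\eta^k_x \]

for fixed large $k$ as $N \to \infty$. This is because

\[ \Big| N^{-1} \sum_x g(x/N)\, \eta(x) - N^{-1} \sum_x g(x/N)\, \bar\eta^k_x \Big| \]

\[ = N^{-1} \Big| \sum_y g(y/N)\, \eta(y) - \sum_y \eta(y)\, (2k + 1)^{-1} \sum_{x : |x - y| \le k} g(x/N) \Big| \]

\[ \le N^{-1} \sum_y \Big| g(y/N) - (2k + 1)^{-1} \sum_{x : |x - y| \le k} g(x/N) \Big|\, \eta(y) \]

Now for any $\epsilon > 0$, for fixed $k$, we have that for all $N > N_0(\epsilon, k)$

\[ \sup_y \Big| g(y/N) - (2k + 1)^{-1} \sum_{x : |x - y| \le k} g(x/N) \Big| < \epsilon \]

since a continuous function on a compact interval is also uniformly continuous.
Thus, we get

\[ \lim_{N \to \infty} \Big( N^{-1} \sum_x g(x/N)\, \eta(x) - N^{-1} \sum_x g(x/N)\, \bar\eta^k_x \Big) = 0 \]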
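
The bound above can be observed numerically. This sketch assumes periodic boundary conditions on $\mathbb{Z}_N$ and the periodic test function $g(\theta) = \sin(2\pi\theta)$, for which $|g(y/N) - g(x/N)| \le 2\pi k/N$ whenever $|x - y| \le k$; the helper name `block_average` is illustrative.

```python
import numpy as np

def block_average(eta, k):
    """eta_bar[x] = (2k+1)^{-1} * sum of eta[y] over |y - x| <= k, indices mod N."""
    return sum(np.roll(eta, s) for s in range(-k, k + 1)) / (2 * k + 1)

rng = np.random.default_rng(5)
N, k = 10000, 10
eta = (rng.random(N) < 0.4).astype(float)
theta = np.arange(N) / N
g = np.sin(2 * np.pi * theta)

raw = np.mean(g * eta)                        # N^{-1} sum g(x/N) eta(x)
smoothed = np.mean(g * block_average(eta, k)) # N^{-1} sum g(x/N) eta_bar_x^k
difference = abs(raw - smoothed)
lipschitz_bound = 2 * np.pi * k / N
```

The difference is controlled by the modulus of continuity of $g$ alone, uniformly in the configuration $\eta$, which is why the replacement is valid for fixed $k$ as $N \to \infty$.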


More generally, we can replace

\[ (1/N) \sum_x g(x/N)\, F(\tau_x \eta) \]
in view of the regularity of $g$ on $[0, 1]$ by

\[ (1/N) \sum_x g(x/N)\, (2k + 1)^{-1} \sum_{y : |y - x| \le k} F(\tau_y \eta) \]

and we can then replace $(2k + 1)^{-1} \sum_{y : |y - x| \le k} F(\tau_y \eta)$ by $\hat F\big((2k + 1)^{-1} \sum_{y : |y - x| \le k} \eta(y)\big)$
if we assume that the $\eta(x)$’s are independent Bernoulli random variables with
corresponding means $\rho(x/N)$ and we define

\[ \hat F(\rho) = E_\rho F(\eta) \]

where if $\eta = (\eta(x) : x \in \mathbb{Z}_N)$, then $E_\rho F(\eta)$ is the mean of $F(\eta(x) : x \in \mathbb{Z}_N)$ with
the $\eta(x)$’s being independent Bernoulli with means $\rho(x)$. This is an elementary
consequence of the law of large numbers. It is valid provided that we take
$k$ sufficiently large. In proving hydrodynamic scaling limits, the kind of $F(\eta)$
that we encounter is typically $F(\eta) = \eta(0)(1 - \eta(1))$, which results in $F(\tau_x \eta) =
\eta(x)(1 - \eta(x + 1))$ and then

\[ (1/N) \sum_x g(x/N)\, \eta(x)(1 - \eta(x + 1)) \]

can be replaced by
\[ (1/N) \sum_x g(x/N)\, \rho(x/N)(1 - \rho((x + 1)/N)) \]

from which the hydrodynamical scaling limit for nearest neighbour interactions
can be derived. More generally, if the jump probabilities $p(z)$ have finite range,
then
\[ (1/N) \sum_{x, z} g(x/N)\, p(z)\, \eta(x)(1 - \eta(x + z)) \quad (a) \]

can be represented by

\[ (1/N) \sum_x g(x/N)\, F(\tau_x \eta) \]

where

\[ F(\eta) = \sum_z p(z)\, \eta(0)(1 - \eta(z)) \]
and then we get that (a) can be replaced by
\[ (1/N) \sum_{x, z} g(x/N)\, p(z)\, \rho(x/N)(1 - \rho((x + z)/N)) \]

which immediately leads to the hydrodynamical scaling limit, which is clearly
linear in the symmetric case, i.e., when $p(z) = p(-z)$.


[35] Some historical remarks: The RLS lattice algorithm, which computes
the prediction filter coefficients recursively both in order and time, was
first developed by the Greek engineers Carayannis, Manolakis and Kalouptsidis
in a pioneering paper published in the IEEE Transactions on Signal
Processing sometime during the eighties. It was polished further, and its stability
analysis carried out, by the team of Thomas Kailath, an Indo-American engineer.
Thomas Kailath and his team members also formulated the RLS lattice
algorithm for multidimensional time series models. The final form of the
RLS lattice algorithm for multivariate time series, with application to cyclostationary
process prediction, was worked out by Mrityunjoy Chakraborty in his
Ph.D thesis. A. Paulraj, along with Richard Roy and Thomas Kailath, invented the ESPRIT
algorithm for estimating the frequencies in a noisy harmonic process based on rotationally invariant techniques.
This work appeared in the IEEE Transactions on Signal Processing sometime in
the mid eighties. The idea was
a computationally efficient high resolution eigensubspace based algorithm for
estimating the sinusoidal frequencies or equivalently the directions of arrival of
multiple plane wave signals. It revolutionized the entire defence and astronomy
industry. Later on, higher dimensional versions of the MUSIC and the ESPRIT
algorithms were invented by the author in his PhD thesis along with his
supervisors. These contributions are present in the papers

[1] Harish Parthasarathy, S. Prasad and S.D. Joshi, "A MUSIC like method
for bispectrum estimation", Signal Processing, Elsevier, North Holland, 1994.

[2] Harish Parthasarathy, S. Prasad and S.D. Joshi, "An ESPRIT like algorithm
for quadratic phase coupling estimation", IEEE Transactions on Signal
Processing, 1995.

The scientific contributions of Subrahmanyan Chandrasekhar:

Chandrasekhar, as a college student in the early 1930’s, discovered a remarkable
consequence of combining relativity, gravitation and quantum mechanics by
applying ideas from these newly developed fields to study the equilibrium of a
star that has exhausted its fuel and is contracting under the influence of gravitation,
with a counter repulsive force caused by the pressure exerted by the
degenerate electron gas within the star owing to the Pauli exclusion principle.
In this way, Chandrasekhar arrived at a fundamental limiting mass of a star
that has exhausted all its fuel. Such a star is called a white dwarf and Chandrasekhar’s
calculations showed that this limiting mass is around 1.4 times the
mass of the sun. The idea behind Chandrasekhar’s calculation can be understood
as follows. All the electrons in the star after its fuel has been exhausted
have energies below the Fermi level and hence if $n(p)$ denotes the number of
electrons per unit volume in phase space and $m_e$ the rest mass of an electron,
then the pressure of this degenerate electron gas is, from the basic formula for
the energy-momentum tensor


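\[ T^{\mu\nu}(x) = \sum_a \frac{p_a^\mu p_a^\nu}{E_a}\, \delta^3(x - x_a) \]

given by (note that $\int T^{\mu\nu} d^4x$ is Lorentz invariant)

\[ P \propto \int n(p)\, \frac{p^2}{\sqrt{m_e^2 c^4 + p^2 c^2}}\, d^3p \]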




Noting that $n(p) = 1/h$, we get
\[ P = \int_0^{\sqrt{2 m_e E_F}} p^2 \cdot 4\pi p^2\, dp / \sqrt{m_e^2 c^4 + p^2 c^2} = P(E_F) \]

say. On the other hand, the volume density of this degenerate electron gas is
given by

\[ \rho = \int_0^{\sqrt{2 m_e E_F}} \sqrt{m_e^2 c^4 + p^2 c^2} \cdot 4\pi p^2\, dp = \rho(E_F) \]

Chandrasekhar then eliminated $E_F$ between these two equations and obtained
the equation of state $P = P(\rho)$ of this degenerate electron gas. He showed that
this equation could well be approximated by a "polytrope":

\[ P = C\rho^\gamma, \quad \gamma \approx 4/3 \]

where the constant $C$ is a function of the fundamental constants $c, h, m_e$.


On the influence of group theory and group representation theory developed 
by Harish-Chandra on problems in statistical image processing. 

On the influence of the book on Lie groups, Lie algebra and their represen- 
tations by V.S.Varadarajan on robotics. 

Varadarajan in his book develops all the analytic and algebraic tools required
for constructing all the irreducible representations of a complex semisimple Lie
algebra/Lie group using the method of the quotient of the universal enveloping
algebra by a maximal ideal, as first developed by Harish-Chandra in his celebrated
paper "On some applications of the universal enveloping algebra of a semisimple
Lie algebra." Prior to this topic, Varadarajan discusses all the other important
analytic tools in Lie group theory like the differential of the exponential map and
the Baker-Campbell-Hausdorff formula for the product of the exponentials of two
Lie algebra elements (for example, in the matrix/linear Lie group case, the product
of the exponentials of two matrices). These ideas play a fundamental role in
robotics where, for example, the robot consists of connected three dimensional
links, each of which can undergo translation and rotation. Thus, the configuration
space of the entire robot is the direct product of n elements of a subgroup
of the full three dimensional Euclidean motion group. Varadarajan then gives a
rigorous proof of Weyl’s celebrated character formula for all compact semisimple
Lie groups in terms of the dominant integral weight of the representation
first introduced by Elie Cartan. This idea is fundamental and can be applied to
problems of image processing as follows. Suppose that we are given a manifold
$M$ on which the compact group $G$ acts. An element $g$ of $G$ transforms an image
field $f : M \to \mathbb{C}$ into another image field $f_1(x) = f(g^{-1}x), x \in M$. Now we
wish to associate an "invariant functional" of the image field that will yield the
same number for $f$ as well as $f_1$ no matter what the element $g$ of $G$ is. This
is possible by taking an irreducible character $\chi(g)$ of $G$, fixing a point $x_0 \in M$, and constructing the
functional

\[ I(f) = \int_{G \times G} \chi(h^{-1}k)\, f(k x_0)\, f(h x_0)\, dh\, dk \]
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For example then 


iis a XR) to) f(g) 


=f xia) f (deco) f(havo)ahdk = IC) 
GxG 


So for each irreducible representation of G we have an associated invariant
functional, and Weyl's character formula combined with Weyl's integration formula
(which reduces an integral of a function on the group to an integral over a Cartan
subgroup of the orbital integral of the function) can be used to evaluate it.
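
The invariance of such a character-based functional can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the compact group is replaced by the finite permutation group S_3 acting on a three-point "image" domain, the Haar integral becomes a normalized double sum, the base point is taken as x0 = 0, and the character is that of the standard two dimensional irreducible representation (number of fixed points minus one). The function names are hypothetical and the functional is taken in the form chi(h^{-1}k) f(k x0) f(h x0).

```python
import itertools
import numpy as np

def compose(p, q):
    """Return the permutation p∘q, acting as x -> p[q[x]]."""
    return tuple(p[q[x]] for x in range(len(p)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

def chi_standard(p):
    """Character of the standard irreducible representation of S_n:
    (number of fixed points of p) - 1."""
    return sum(1 for x, y in enumerate(p) if x == y) - 1

def invariant_functional(f, x0=0):
    """Normalized double sum over G x G of chi(h^{-1} k) f(k x0) f(h x0)."""
    group = list(itertools.permutations(range(len(f))))
    total = 0.0
    for h in group:
        h_inv = inverse(h)
        for k in group:
            total += chi_standard(compose(h_inv, k)) * f[k[x0]] * f[h[x0]]
    return total / len(group) ** 2

def transform(f, g):
    """Return the transformed image f1(x) = f(g^{-1} x)."""
    g_inv = inverse(g)
    return np.array([f[g_inv[x]] for x in range(len(f))])

if __name__ == "__main__":
    f = np.array([1.0, 2.0, 4.0])
    base = invariant_functional(f)
    for g in itertools.permutations(range(3)):
        # Each transformed image should give the same value as the original.
        print(g, invariant_functional(transform(f, g)), base)
```

For a genuinely continuous compact group the double sum would have to be replaced by a quadrature over the group, which this sketch does not attempt.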


On the influence of the book "Perturbation theory for linear operators" by
Tosio Kato on developments in robotics and quantum mechanics.

Kato single-handedly created the notion of relative boundedness and relative
compactness of an operator w.r.t. another in Banach and Hilbert spaces and
applied it to a systematic study of unbounded operators in Hilbert space,
obtaining results such as conditions under which the perturbation of an
unbounded closed operator by another one remains closed. This property is also
known as stability of closedness. He also applied the notion of relative
boundedness to answer the question of when the perturbation of an unbounded
self-adjoint operator by a closed symmetric operator remains self-adjoint. His
book is a masterpiece, as the only analytic tool that he uses in answering most
such questions about unbounded operators is the resolvent of the operator and
its integral around a closed curve in the complex plane after multiplying it by
a function of the complex variable. Kato's book is applicable to a variety of
problems in quantum mechanics, such as obtaining bounds on the charge of the
electron that would guarantee that the Hamiltonian of the two electron atom is
self-adjoint.
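
The role of the resolvent integral can be illustrated in finite dimensions, where every operator is a bounded matrix. The following is a minimal sketch, not Kato's construction for unbounded operators: it approximates (1/(2 pi i)) times the contour integral of f(z)(zI - A)^{-1} over a circle by the trapezoidal rule. With f = 1 and a circle enclosing a single eigenvalue this gives the spectral projection onto the corresponding eigenspace; with a circle enclosing the whole spectrum it gives f(A). The function name, the sign convention (zI - A)^{-1} and the test matrix are assumptions made for the illustration.

```python
import numpy as np

def contour_function(A, f, center, radius, num_points=400):
    """Approximate (1/(2*pi*i)) * integral of f(z) (zI - A)^{-1} dz over the
    circle |z - center| = radius, using the trapezoidal rule."""
    n = A.shape[0]
    identity = np.eye(n, dtype=complex)
    result = np.zeros((n, n), dtype=complex)
    for j in range(num_points):
        phase = np.exp(2j * np.pi * j / num_points)
        z = center + radius * phase
        resolvent = np.linalg.inv(z * identity - A)
        # dz = i * radius * phase * dphi, so dz / (2*pi*i) = radius * phase / num_points
        result += f(z) * resolvent * radius * phase
    return result / num_points

if __name__ == "__main__":
    A = np.array([[2.0, 1.0], [0.0, 5.0]])   # non-normal, eigenvalues 2 and 5
    # Circle around the eigenvalue 2 only: spectral (Riesz) projection.
    P = contour_function(A, lambda z: 1.0, center=2.0, radius=1.0)
    print(np.round(P.real, 6))
    # Circle around both eigenvalues with f = exp: the matrix exponential of A.
    print(np.round(contour_function(A, np.exp, center=3.5, radius=3.0).real, 6))
```

The contour must stay away from the spectrum of A, since the resolvent is singular there.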


On the influence of the papers of Albert Einstein on developments in modern 
cosmology, evolution of inhomogeneities in the expanding universe and in the 
unification of gravity with quantum mechanics. 

On the influence of the books of Steven Weinberg on modern quantum field 
theory especially in the problem of electroweak unification, ie, unification of the 
weak nuclear forces with electromagnetism. 

The volumes start with the canonical quantization of the Klein-Gordon field,
based on transforming the Lagrangian density of the field to a Hamiltonian
density and introducing canonical commutation relations between the position
and momentum fields (equal time commutation relations). Then, by means
of a spatial Fourier decomposition of this field in terms of creation and
annihilation operator fields in momentum space, the author derives the canonical
commutation relations for these operators. Prior to this, the author introduces
the notion of a wave function in second quantized Hilbert space corresponding to
a definite four momentum and spin/helicity, and shows how a unitary
representation of the Lorentz group acts on such a wave function. The action of
such a representation turns out from natural principles to have two components:
first, a Lorentz transformation of the four momentum of the wave function and
second, a little group element acting on the helicity variables. The author
then presents
the basic elements of quantum scattering theory. This presentation is based on
assuming that the free particle state evolving under the free projectile
Hamiltonian first interacts with the scattering centre, causing it to get
scattered to an in-scattered state evolving under the free particle Hamiltonian
plus an interaction potential, in such a way that in the remote past both the
free particle and the interacting particle wave functions coincide. This gives
rise to the notion of a wave operator. Likewise, the particle gets scattered
from the in-scattered state to an out-scattered state and then to a free
particle state as time goes to infinity, in such a way that the out-scattered
state evolving under the interacting Hamiltonian coincides, as time goes to
infinity, with a free particle out state evolving under the free particle
Hamiltonian. Thus we get two kinds of wave operators, and one constructs the
scattering matrix using these two operators. Weinberg, with remarkable physical
insight, skips over rigorous mathematical details involving the various notions
of operator convergence and directly gives us the relationship between the
in-particle free state and the corresponding in-particle scattered state in
terms of the interaction potential and the spectrum of the free particle
Hamiltonian. Likewise, he tells us the corresponding relationship between the
out free particle and out scattered states. These relationships are known as
the Lippmann-Schwinger equations and can be made mathematically rigorous using
the spectral theorem and spectral measures for unbounded operators in Hilbert
space. See for example the classic "Perturbation theory for linear operators"
by Tosio Kato.
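
For orientation, the objects described in words above can be written out as follows. This is a sketch in the notation commonly used in Weinberg's first volume; the symbols $H = H_0 + V$, $\Phi_\alpha$ (free states with $H_0\Phi_\alpha = E_\alpha\Phi_\alpha$), $\Psi_\alpha^{\pm}$ (in and out states) and $\Omega(\tau)$ are introduced here for the illustration and do not appear in the text above.

```latex
% Wave operators for H = H_0 + V:
\Omega(\tau) = e^{iH\tau}\, e^{-iH_0\tau}, \qquad
\Psi_\alpha^{\pm} = \Omega(\mp\infty)\,\Phi_\alpha

% Lippmann-Schwinger equations ("+" for in states, "-" for out states):
\Psi_\alpha^{\pm} = \Phi_\alpha
  + (E_\alpha - H_0 \pm i\epsilon)^{-1}\, V\, \Psi_\alpha^{\pm},
\qquad \epsilon \to 0^{+}

% Scattering matrix built from the two wave operators:
S = \Omega(\infty)^{\dagger}\,\Omega(-\infty), \qquad
S_{\beta\alpha} = (\Psi_\beta^{-}, \Psi_\alpha^{+})
```

The limits $\tau \to \mp\infty$ are to be understood in the sense of convergence on suitable wave packets, which is exactly the point where the rigorous operator-theoretic treatment is needed.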


Salam and Strathdee introduced the notion of a super-vector field as a
first order differential operator in the Bosonic and Fermionic coordinates. They
used such super-vector fields to define infinitesimal transformations of
superfields under a supersymmetry transformation. Later on, researchers in
quantum field theory showed how to write down various kinds of supersymmetric
actions, ie, action functionals that remain invariant under supersymmetry
transformations in addition to Lorentz and gauge transformations.
These later developments are marvellously described in the third volume,
"Supersymmetry", by Steven Weinberg.


[36] Remarks about supergravity 

This section of the book has been adapted from the celebrated volumes of
Michael Green, John Schwarz and Edward Witten on Superstring Theory. Some
parts have also been taken from Weinberg, Vol. III.

The most general supergravity action contains, apart from the Bosonic part
involving the Einstein-Hilbert action functional of the tetrad field, Fermionic
parts involving the gravitino. The gravitino is the superpartner of the
graviton. The tetrad or Vierbein field describes the graviton, while the
gravitino field has both a spinor index and a vector index and is therefore a
particle of spin 3/2. The supergravity field also contains a Yang-Mills gauge
field component. This gauge field is a Bosonic field, and therefore the action
should also contain the superpartner of this Yang-Mills field, namely a
Fermionic spin 1/2 field, whose action is the usual matter field action of the
Dirac field generalized to the Yang-Mills gauge group. The whole action must be
invariant under local supersymmetry transformations, apart from being invariant
under local Lorentz transformations, diffeomorphisms of space-time and gauge
transformations. It therefore remains to define such local supersymmetry
transformations in such a way that the entire set of all the above mentioned
four kinds of transformations closes on itself, ie, forms a Lie algebra of
transformations. The starting point for constructing an appropriate action for
supergravity is the equation of supercurrent conservation.

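Following Green, Schwarz and Witten, we write a simplified form of the
supergravity Lagrangian, without the Yang-Mills matter and gauge terms, as

$$L = e\, e^\mu_m e^\nu_n R_{\mu\nu}^{mn} + \bar\chi_\mu \Gamma^{\mu\nu\rho} D_\nu \chi_\rho$$

where

$$D_\mu = \partial_\mu + \omega_\mu^{mn}\Gamma_{mn}$$

and

$$R_{\mu\nu}^{mn} = \partial_\mu\omega_\nu^{mn} - \partial_\nu\omega_\mu^{mn} + [\omega_\mu, \omega_\nu]^{mn}$$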

Note that x, or equivalently y_ is a Majorana Fermion which means that 
Xp = ya = xT Te 


so that 


[x wT yD? 


x =X 
is the adjoint (ie, conjugate transpose) of $\chi_\mu$. Under the local supersymmetry
transformation defined by the Majorana Fermionic parameter field $\epsilon(x)$, we have


$$\delta\chi_\mu = D_\mu\epsilon(x), \qquad \delta e^m_\mu = \bar\epsilon(x)\Gamma^m\chi_\mu(x)$$


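and the transformation law of $\omega_\mu^{mn}$ is determined by the field
equation that it satisfies, ie,

$$-\partial_\mu(e e^\mu_m e^\nu_n)\,\delta\omega_\nu^{mn} + \partial_\nu(e e^\mu_m e^\nu_n)\,\delta\omega_\mu^{mn} + ([\delta\omega_\mu, \omega_\nu] + [\omega_\mu, \delta\omega_\nu])^{mn}\, e e^\mu_m e^\nu_n$$

$$+ \bar\chi_\mu \Gamma^{\mu\nu\rho}\Gamma_{mn}\chi_\rho\,\delta\omega_\nu^{mn} = 0$$
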
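Remark:

$$[\omega_\mu, \omega_\nu]^{mn}\Gamma_{mn} = \omega_\mu^{pq}\omega_\nu^{rs}[\Gamma_{pq}, \Gamma_{rs}]$$

where

$$[\Gamma_{pq}, \Gamma_{rs}] = \Gamma_{ps}\eta_{qr} + \Gamma_{qr}\eta_{ps} - \Gamma_{pr}\eta_{qs} - \Gamma_{qs}\eta_{pr}$$

from which we deduce that

$$[\omega_\mu, \omega_\nu]^{mn} = \omega_\mu^{mq}\omega_\nu^{rn}\eta_{qr} + \omega_\mu^{pm}\omega_\nu^{ns}\eta_{ps} - \omega_\mu^{mq}\omega_\nu^{ns}\eta_{qs} - \omega_\mu^{pm}\omega_\nu^{rn}\eta_{pr}$$

$$= 4\,\omega_\mu^{mp}\omega_\nu^{qn}\eta_{pq}$$
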
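
The commutation relation $[\Gamma_{pq}, \Gamma_{rs}] = \Gamma_{ps}\eta_{qr} + \Gamma_{qr}\eta_{ps} - \Gamma_{pr}\eta_{qs} - \Gamma_{qs}\eta_{pr}$ and the resulting factor of 4 in the contracted commutator can be checked numerically. The sketch below rests on assumptions not stated in the text: four space-time dimensions, the Dirac representation of the gamma matrices, metric signature (+,-,-,-), and the normalization $\Gamma_{pq} = \frac{1}{4}[\gamma_p, \gamma_q]$. With a different normalization the right hand sides pick up a constant factor.

```python
import itertools
import numpy as np

# Pauli matrices and Dirac matrices in the Dirac representation (assumed choice).
s0 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
zero = np.zeros((2, 2), dtype=complex)

gamma = [np.block([[s0, zero], [zero, -s0]])] + [
    np.block([[zero, s], [-s, zero]]) for s in (sx, sy, sz)
]
eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed metric signature (+,-,-,-)

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

# Assumed normalization: Gamma_{pq} = (1/4) [gamma_p, gamma_q].
Gamma = [[0.25 * comm(gamma[p], gamma[q]) for q in range(4)] for p in range(4)]

def max_commutator_error():
    """Largest entrywise deviation from the commutation relation over all indices."""
    worst = 0.0
    for p, q, r, s in itertools.product(range(4), repeat=4):
        lhs = comm(Gamma[p][q], Gamma[r][s])
        rhs = (Gamma[p][s] * eta[q, r] + Gamma[q][r] * eta[p, s]
               - Gamma[p][r] * eta[q, s] - Gamma[q][s] * eta[p, r])
        worst = max(worst, np.abs(lhs - rhs).max())
    return worst

def contraction_error(seed=0):
    """Check [w1, w2] = 4 w1^{pq} w2^{rs} eta_{qr} Gamma_{ps} for antisymmetric w1, w2."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(4, 4))
    b = rng.normal(size=(4, 4))
    w1, w2 = a - a.T, b - b.T  # antisymmetric coefficient arrays
    idx = range(4)
    W1 = sum(w1[p, q] * Gamma[p][q] for p in idx for q in idx)
    W2 = sum(w2[r, s] * Gamma[r][s] for r in idx for s in idx)
    rhs = 4.0 * sum(w1[p, q] * w2[r, s] * eta[q, r] * Gamma[p][s]
                    for p, q, r, s in itertools.product(idx, repeat=4))
    return np.abs(comm(W1, W2) - rhs).max()

if __name__ == "__main__":
    print(max_commutator_error(), contraction_error())
```

Both printed numbers should be at the level of floating point rounding error if the relation and the factor of 4 hold under the stated assumptions.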
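This equation implies that for all variations $\delta\omega_\nu^{mn}$, the
following equation is satisfied:

$$-2\,\partial_\mu(e e^\mu_m e^\nu_n)\,\delta\omega_\nu^{mn} + 8 e\,\omega_\mu^{mp}\,\delta\omega_\nu^{qn}\,\eta_{pq}\, e^\mu_m e^\nu_n + \bar\chi_\mu \Gamma^{\mu\nu\rho}\Gamma_{mn}\chi_\rho\,\delta\omega_\nu^{mn} = 0$$
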


Thus, $\omega_\mu^{mn}$ satisfies the algebraic field equation:
=e + (eee = Quh4 Nnq€hn ‘aa xT Cee Xo = =0 


It is an easy matter to solve this algebraic field equation for $\omega_\mu^{mn}$.


Now the quadratic part in $\omega_\mu^{mn}$ when the action is varied under the
above supersymmetry transformation is
mp, nv mn Tp vp, rs 
Ae OL Nog PO, ole) Taal eer wr Xp 


+xh Tse. Pee TD sPmnWy Wy €0(Z) 


[37] One of the main achievements in the work of C.R. Rao was the proof of
the lower bound on the error covariance matrix of a statistical estimator of a
vector valued parameter based on vector valued observations, using techniques
of matrix theory. This bound is known as the Cramer-Rao lower bound (CRLB).
C.R. Rao has also considered the case when the Fisher information matrix is
singular, and in this case he has been able to use the methods of generalized
inverses to obtain new formulas for the lower bound. The lower bound on the
variance of an estimator should be compared to the Heisenberg uncertainty
principle in quantum mechanics for two non-commuting observables. In fact, it
can be shown that the Heisenberg uncertainty inequality for position and
momentum can be derived using the CRLB. The CRLB roughly tells us that no
matter how much we may try, we can never achieve complete accuracy in our
estimation process, ie, there is inherently some amount of uncertainty about
the system that generates a random observation.

Remarks about efficiency of a statistical estimator of a parameter 
using CRLB 

Let $X \in \mathbb{R}^n$ be the vector valued measurement and $\theta \in \mathbb{R}^p$
be the parameter to be estimated. The pdf of the measurement is $p(X|\theta)$.
Let $\hat\theta(X)$ be any estimator of $\theta$ based on the measurement $X$.
The bias of this estimator is

$$B(\theta) = E(\hat\theta(X)) - \theta$$

which can be alternately expressed as

$$\int (\hat\theta(X) - \theta)\, p(X|\theta)\, dX = B(\theta)$$

or in component form,

$$\int (\hat\theta_a(X) - \theta_a)\, p(X|\theta)\, dX = B_a(\theta), \quad a = 1, 2, \ldots, p$$


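Differentiating both sides of this expression w.r.t. $\theta_b$ and using
$\int p(X|\theta)\, dX = 1$ gives

$$-\delta_{ab} + \int (\hat\theta_a(X) - \theta_a)\,(\partial p(X|\theta)/\partial\theta_b)\, dX = \partial B_a(\theta)/\partial\theta_b$$

This equation is multiplied by $u_a v_b$ and summed over $a, b = 1, 2, \ldots, p$ to get

$$-u^T v + E[u^T(\hat\theta(X) - \theta)\,(\nabla_\theta \ln p(X|\theta))^T v] = u^T B'(\theta) v$$

or equivalently,

$$E[u^T(\hat\theta(X) - \theta)\,(\nabla_\theta \ln p(X|\theta))^T v] = u^T (I + B'(\theta)) v$$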


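from which we deduce, using the Cauchy-Schwarz inequality,

$$(u^T (I + B'(\theta)) v)^2 \le E[(u^T(\hat\theta(X) - \theta))^2] \cdot E[(v^T \nabla_\theta \ln p(X|\theta))^2] = (u^T R u)(v^T J v)$$

where

$$R = E[(\hat\theta(X) - \theta)(\hat\theta(X) - \theta)^T]$$

and

$$J = E[\nabla_\theta \ln p(X|\theta)\,(\nabla_\theta \ln p(X|\theta))^T] = \left(\left( E\left[\frac{\partial \ln p(X|\theta)}{\partial\theta_a} \cdot \frac{\partial \ln p(X|\theta)}{\partial\theta_b}\right] \right)\right)$$
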
Note that if $\hat\theta(X)$ is an unbiased estimator of $\theta$, then $B(\theta) = 0$ and

$$R = \mathrm{Cov}(\hat\theta(X)) = C$$

and the above inequality reduces to

$$(u^T v)^2 \le (u^T C u)(v^T J v)$$

In the general case, consider

$$(u^T G v)^2 \le (u^T R u)(v^T J v), \quad G = I + B'(\theta), \quad B'(\theta) = \nabla_\theta B(\theta)$$


Substitute into this

$$v = Au$$

to get

$$(u^T G A u)^2 \le (u^T R u) \cdot (u^T A^T J A u)$$

This inequality holds for any $p \times p$ real matrix $A$. Let in particular,
assuming that $J$ is nonsingular,

$$A = J^{-1} G^T$$

to get

$$(u^T G J^{-1} G^T u)^2 \le (u^T R u) \cdot (u^T G J^{-1} G^T u)$$

or equivalently,

$$u^T G J^{-1} G^T u \le u^T R u$$

or equivalently,

$$u^T (R - G J^{-1} G^T) u \ge 0 \quad \text{for all } u \in \mathbb{R}^p$$

In other words,

$$R \ge G J^{-1} G^T$$


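in the sense that the difference between the LHS and the RHS is a positive
semidefinite matrix. In particular, if we allow $u$ to run over an orthonormal
basis for $\mathbb{R}^p$ and add all the inequalities so obtained, we get

$$\mathrm{Tr}(R) \ge \mathrm{Tr}(G J^{-1} G^T)$$

which is the same as saying that

$$E[\| \hat\theta(X) - \theta \|^2] \ge \mathrm{Tr}(G J^{-1} G^T)$$

The LHS in this inequality is the mean square estimation error. In the special
case when the estimator is unbiased, this inequality reads

$$\mathrm{Var}(\hat\theta(X)) \ge \mathrm{Tr}(J^{-1})$$

where $\mathrm{Var}(\hat\theta(X)) = \mathrm{Tr}(\mathrm{Cov}(\hat\theta(X)))$ is the
sum of the variances of the components of the estimator.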
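
The biased form of the bound, $R \ge G J^{-1} G^T$, can be illustrated on a model where every quantity is known in closed form. The sketch below is an assumed example, not taken from the text: $n$ i.i.d. observations from $N(\theta, \sigma^2 I)$ and the shrinkage estimator $\hat\theta = a\,\bar X$ with a fixed constant $a$, for which $B(\theta) = (a-1)\theta$, $G = aI$ and $J = (n/\sigma^2) I$. The function name and the parameter values are hypothetical.

```python
import numpy as np

def crlb_shrinkage_demo(theta, sigma=1.0, n=10, a=0.8, trials=200_000, seed=0):
    """Compare the error matrix R of theta_hat = a * sample_mean with G J^{-1} G^T."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    rng = np.random.default_rng(seed)
    # The sample mean of n i.i.d. N(theta, sigma^2 I) vectors is N(theta, (sigma^2/n) I).
    xbar = theta + (sigma / np.sqrt(n)) * rng.standard_normal((trials, p))
    err = a * xbar - theta                 # estimation error of theta_hat = a * xbar
    R_mc = err.T @ err / trials            # Monte Carlo estimate of R
    G = a * np.eye(p)                      # G = I + B'(theta), B(theta) = (a - 1) theta
    J = (n / sigma**2) * np.eye(p)         # Fisher information of the n observations
    bound = G @ np.linalg.inv(J) @ G.T
    # Closed form: covariance of a * xbar plus the outer product of the bias.
    R_exact = bound + (a - 1.0) ** 2 * np.outer(theta, theta)
    return R_mc, R_exact, bound

if __name__ == "__main__":
    R_mc, R_exact, bound = crlb_shrinkage_demo([1.0, 2.0])
    # Eigenvalues of R - G J^{-1} G^T should be nonnegative up to sampling noise.
    print(np.linalg.eigvalsh(R_exact - bound))
    print(np.linalg.eigvalsh(R_mc - bound))
```

In this example the gap between $R$ and the bound is exactly the rank one matrix $(a-1)^2\theta\theta^T$, so the bound is attained in every direction orthogonal to $\theta$.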


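Remark: There is an alternate expression for the Fisher information matrix:

$$J_{ij} = \int p\,(\ln p)_{,i}\,(\ln p)_{,j}\, dX = \int p_{,i}\,(\ln p)_{,j}\, dX$$

$$= \partial_i \int p\,(\ln p)_{,j}\, dX - \int p\,(\ln p)_{,ij}\, dX$$

$$= \partial_i \int p_{,j}\, dX - E[(\ln p)_{,ij}]$$

where $\partial_i$, $(\cdot)_{,i}$ etc. denote differentiation w.r.t. $\theta_i$ and
not w.r.t. $X_i$. Thus, since $\int p_{,j}\, dX = \partial_j \int p\, dX = 0$,

$$J_{ij} = -E\left[\frac{\partial^2 \ln p(X|\theta)}{\partial\theta_i\, \partial\theta_j}\right]$$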
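
The equality of the two expressions for the Fisher information matrix (expected outer product of the score, and minus the expected Hessian of the log-likelihood) can be checked by simulation. The following is a minimal sketch under an assumed model that is not taken from the text: a single scalar observation $X \sim N(\mu, s^2)$ with parameter $\theta = (\mu, s)$, for which the exact Fisher information is $\mathrm{diag}(1/s^2, 2/s^2)$. The function name and parameter values are hypothetical.

```python
import numpy as np

def fisher_two_ways(mu=1.0, s=2.0, trials=400_000, seed=0):
    """Monte Carlo estimates of E[score score^T] and -E[Hessian of ln p]
    for X ~ N(mu, s^2) with theta = (mu, s)."""
    rng = np.random.default_rng(seed)
    x = mu + s * rng.standard_normal(trials)
    z = (x - mu) / s
    # Score: d/dmu ln p = z / s, d/ds ln p = (z^2 - 1) / s.
    score = np.stack([z / s, (z**2 - 1.0) / s], axis=1)
    J_outer = score.T @ score / trials
    # Second derivatives of ln p with respect to (mu, s).
    h_mm = -np.ones(trials) / s**2
    h_ms = -2.0 * z / s**2
    h_ss = (1.0 - 3.0 * z**2) / s**2
    J_hess = -np.array([[h_mm.mean(), h_ms.mean()],
                        [h_ms.mean(), h_ss.mean()]])
    J_exact = np.diag([1.0 / s**2, 2.0 / s**2])
    return J_outer, J_hess, J_exact

if __name__ == "__main__":
    J_outer, J_hess, J_exact = fisher_two_ways()
    print(J_outer)
    print(J_hess)
    print(J_exact)
```

The two estimates agree only up to Monte Carlo error, which shrinks like the inverse square root of the number of trials.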


